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ABSTRACT  ■  .  / 

Time  series  models  with  autoregressive ,  moving  average  and  mixed 
autoregressive-moving  average  correlation  structure  and  with  symmetric, 
heavy-tailed,   non-normal  marginal   distributions,   called  Jl -Laplace,   are 
considered . 

First,   a  flexible  mixed  model  NLARMA(p,q)  with  Laplace    (double 
exponential)  marginals  is  investigated.      The   correlation  structure  for 
several   special   cases   is  derived.     The  innovation  sequence  for  the 
second-order  autoregressive   case,    NLAR(2),   is   derived.      Parameter 
estimation  in  the  NLAR( 1 )    models  is  discussed  in  terms  of  moments,    least 
squares   and  maximum  likelihood. 

Second,   a  family  of   continuous   random   coefficient  models  with 
2,-Laplace   distributions   are  examined.      The    Sl-Laplace   distribution  is 
described  along  with  a   useful   transformation.      The   correlation  structure 
for  special   cases   is  derived.      For   a  special    case  when   i   is  one,    the 
BELAR(l)  model  with  Laplace  marginals,   the  maximum  likelihood  estimator 
of   serial   correlation  is  derived.      Least  squares   estimates   are  also 
derived   using  the   concept   of   a  linear  residual.      These  estimators   of 
correlation,   along  with  other   estimators   of   location  and  scale  are 
compared  in  a  small   simulation  study. 

Thirdly,   the  NLAR( 1 )    and  the   BELAR(I)    processes   are  compared  using 

higher   order  residual   analyses   based  on  the   uncorrelated ,    but   dependent 

linear   residuals,    {R   }. 

n 

Finally,   open   problems,   as   well   as   possible  extensions   and 
applications   of  the   analyses   given  in  this  thesis   are   discussed. 
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I.       INTRODUCTION 

In    standard    time    series    analysis,    one    assumes. the   marginal 

distributions  of   {X   }  are  Normal,   i.e.   Gaussian.      However,    a   Gaussian 

n 

distribution  will  not  always  be  appropriate.  In  earlier  works  by  Gaver 
and  Lewis  [Ref.  1];  Jacobs  and  Lewis  [Refs.  2,3];  and  Lawrance  and  Lewis 
[Refs.  4,5,6],  stationary  non-Gaussian  time  series  models  were  developed 
for   variables   with  positive   and  highly  skewed  marginal   distributions. 

There  still  remain  other  situations  for  which  Gaussian  marginals  are 
inappropriate,  i.e.  the  marginal  time  series  variable  being  modelled, 
although  not  skewed  or  inherently  positive  valued,  has  a  large  kurtosis 
or  long-tailed  distribution.  The  position  errors  in  a  large  navigation 
system  have  such  a  distribution.  In  particular,  Hsu  [Ref.  7]  modelled 
pooled  position  errors  using  the  double  exponential  distribution  (also 
called  the  Laplace  distribution).  Also  McGill  [Ref.  8]  showed  that  the 
Laplace  distribution  provides  a  characterization  of  the  error  in  a 
timing  device  under  periodic  excitation.  Speech-waves  are  modelled 
using  Laplace  variables  (Davenport  [Ref.  9]).  In  the  "speech-like" 
process   given  by  the  linear  AR(  1 )   model 


X„-CX^.,    -    (1    -    o')^'2^„_  (I.l) 


where  .8  <  c  ^  .9,  the  innovation  sequence  {E  }  is  i.i.d.  Laplace  (Linde 

n 

and  Gray   [Ref.    10]).      In  image   coding  systems    using  a    two-dimensional 
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discrete  cosine,  DC,  transform,  Reininger  and  Gibson  [Ref.  11]  showed 
that  the  Laplace  distribution  gives  the  best  approximation  to  the 
distribution  of  the  non-DC  coefficients.  Recently  Sethia  and  Anderson 
[Ref.  12]  required  a  stationary  autoregressi ve  process  with  Laplace 
marginals  in  their  research  in  communications   technology. 

Even  before  Gaver  and  Lewis  [Ref.  1]  wrote  the  pioneering  paper  on 
the  subject  of  autoregressi ve  processes  with  a  specified  non-Normal 
marginal  distribution,  Gastwirth  and  Wolff  [Ref. 13]  had  derived  a 
solution  to  the  linear   additive  first-order   difference  equation 


X„   -    PX„.,    *    E^,  (1.2) 


for  which  {X  }  is  marginally  Laplace.  This  result  was  used  later  by 
Gastwirth  and  Rubin  [Ref. 14]  within  the  context  of  robust  estimation  on 
dependent  data.  This  solution  to  (1.2)  is  here  called  the  Laplace 
First-order  Autoregressi ve  Process  (LAR(1)).  The  early  solution  of 
(1.2)  is  mentioned  at  this  point,  merely  to  further  substantiate  the 
claim  that   non-Normal,   heavy-tailed  distributions   are  of    interest. 

In  this  thesis,  several  time  series  models  with  a  specified 
symmetric,  heavy-tailed  marginal  distribution  are  presented.  This 
distribution,  called  the  2,-Laplace  distribution,  includes  the  Laplace 
distribution  as  a  special  case.  The  approach  in  Chapter  II  extends  the 
discrete  random  coefficient  model  of  Lawrance  and  Lewis  [Ref.  6],  New 
Exponential  Autor egr essi ve  Moving  Average--NEARMA( p ,q ) ,  to  the  case 
where    the    marginal    distribution    is    Laplace,    also   called    double 
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Exponential.       This    class    of    models    is    called    The    New    Laplace 

Autoregressive  Moving  Average  model,    NLAR^4A(p  ,q) .      Several   special   cases 

of    NLARMA(p,q)    are    individually    researched.       The    second-order 

autoregressive  model,    NLAR(2),    is  established  by  showing  the   conditions 

for   existence  and  uniqueness   and  by  specifying  the  innovation  structure. 

The    correlation   structure   of    NLAR( 2)    is  also  given  along  with  results 

concerning  directional  moments   and  partial   time  reversibility. 

For    the    case  when  p   =  1    and  q  =  0,    called  NLAR( 1 ) ,   the   distribution 

of   the  difference  X  -X     ,    is  derived,   providing   some   insight    into   the 

n     n-1 

nature   of    the   differenced  NLAR(1)   model.      The   conditional   density  of   X 

given  X  ,  is  also  derived,  which  leads  to  a  brief  investigation  of  the 
^  n-1 

likelihood  function.  Parameter  estimation  in  NLAR( 1 ) ,  however,  is 
limited  to  comparisons  of  the  moment  estimators  and  the  least  squares 
estimators  for   the   independent  model    parameters   of   serial    correlation. 

The  correlation  structure  is  derived  for  other  models  in  the 
NLARMA(p,q)  family:  the  first-order  moving  average  called  NLMA(1);  the 
first-order  mixed  model  called  NLARMA(1,1);  and  the  special  cases  of 
p  -order  autoregressive  models  called  TLAR(p)  that  are  analogous  to  the 
TEAR(p)  model  of  Lawrance  and  Lewis  [Ref .  6].  These  models  demonstrate 
the  flexibility  of   the  NLARMA(p,q)   family. 

In  Chapter  III,  a  family  of  stationary  time  series  is  developed 
using  continuous  random  coefficients  in  the  additive  difference  equation 
model.  The  marginal  distribution  is  specified  to  be  a  member  of  the  so- 
called   Jl-Laplace   distributions,    the   properties   of   which  are  described  at 
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the   beginning  of   the   chapter.      The   "square-root   Beta-Laplace"    transform 
is  defined.      It  is  used  to  formulate  the   2, -Laplace  time  series  models. 

For  the  special  case  when  11  =  1,  the  marginal  distribution  is  again 
Laplace.  The  aut  oregressi  ve  model  is  called  the  Beta-Laplace  First- 
Order  Autoregressive  model,  BELAR(l).  The  conditional  density  of  X 
given  X^_^  is  derived.  This  leads  to  the  derivation  of  a  likelihood 
function  and  a  numerical  technique  to  evaluate  and  maximize  the 
likelihood  function  with  respect  to  the  model  parameter  for  serial 
correlation. 

Several  facets  of  the  parameter  estimation  problem  are  investigated 
for  BELAR(I).  The  behavior  of  different  estimators  of  scale  and 
location  are  compared  using  the  Simulation  Testbed  (SIMTBED)  of  Lewis, 
Orav  and  Uribe  [Ref.  15].  The  least  squares  estimation  theory  is 
derived  around  the  concept  of  a  linearized  residual.  Asymptotic 
properties  are  derived  using  results  from  Nicholls  and  Quinn  [Ref.  16]. 
Robust  estimators  are  defined  and  simulated  in  SIMTBED.  Finally,  a 
numerical  scheme  for  finding  the  maximum  likelihood  estimator  of  serial 
correlation  is  used  in  a  small  simulation  study  of  the  small  sample 
properties   of   the  maximum  likelihood  estimator. 

In  the  last  section  of  Chapter  III,  a  first-order  moving  average 
model  is  discussed.  A  q  -order  moving  average  model  in  i-Laplace 
variables   is  also  derived. 

The  random  coefficient  approaches  are  not  the  only  ways  to  generate 
Laplace  or  other  variables  with  a  specified  correlation  structure.  The 
literature  contains   numerous   articles   on  generation  of   random   sequences. 
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One  approach  put  forth  in  several  papers  (Gujar  and  Kavanagh  [Ref.  17]; 
Haddad  and  Valisalo  [Ref.  18];  Li  and  Hammond  [Ref.  19];  Liu  and  Munson 
[Ref.  20];  Sondhi  [Ref.  21])  involves  passing  white  Gaussian  noise 
through  a  linear  filter  followed  by  a  zero  memory  nonlinear  transform. 
This  is  a  general  procedure  that  produces  exactly  the  required  marginal 
distribution  and  a  good  approximation  to  the  autocorrelation  structure. 
However,  the  scheme  lacks  the  simplicity  of  either  of  the  methods  being 
proposed.  Moreover,  the  filtering  approach  produces,  for  example,  in 
the  first-order   autoregressive   case,   only  one  process. 

It  is  important  to  note  that  in  non-Normal  time  series,  there  are 
infinitely  many  processes  with  a  given  marginal  and  autocorrelation 
structure.  This  is  the  case,  for  example,  in  the  two-parameter  NLAR(1) 
process.  The  differences  in  these  processes  must  be  explored  through 
higher  joint  moments.  In  Chapter  IV,  residual  analyses  using  fourth 
joint  moments  are  derived.  The  ideas  are  modifications  of  those  from 
Lawrance  and  Lewis  [Refs.  5,  22],  who  accomplished  an  analysis  using 
joint  third  moments  within  the  NEAR  framework.  The  residual  analysis  is 
applied  to  show  the  differences  in  the  various  NLAR( 1 )  processes  and  the 
BELAR(I)    process. 

In  Chapter  V,  open  problems  and  possible  extensions  of  the  analyses 
given  in  this  thesis  are  discussed.  Possible  applications  to  the 
analysis  of   wind  velocity  data  are  detailed. 
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II.      DISCRETE    RANDOM  COEFFICIENT    MODELS   WITH  LAPLACE    MARGINALS 

A.      INTRODUCTION 

Two  aspects  of  modelling  with  dependent  random  variables  are  widely 
studied — the  marginal  distribution  and  the  correlation  structure.  It  is 
widely  known  how  to  generate  sequences  with  either  a  specified  marginal 
distribution  or  a  particular  correlation  structure.  Transforming  the 
random  variables  may  have  an  undesirable  and  unknown  effect  on  the 
correlation  structure.  Likewise,  the  marginal  distribution  of  a 
filtered  process  may  be   unknown. 

It  is  the  generation  of  random  variables  with  both  a  specified 
marginal  and  a  specified  correlation  structure  that  is  discussed  in  this 
chapter.  Specifically,  we  want  sequences  with  a  Laplace  (double 
Exponential)  marginal  distribution  and  with  ARMA(p,q)  correlation 
structures  as  given  by  Box  and  Jenkins  [Ref.  231  for  the  usual  linear 
ARMA(p,q)   models. 

The  following  is  an  example  of  a  process  that  has  Laplace  marginals. 
Let    {X    }   be   a  binary   Markov   chain  with   transition   matrix    P,    so   that 

P[X^  =  OlX^_^=0]    =    a^,     P[X^=1|X^.^=0]    =    1-a^.    P[X,=  1  1  X^_,  =  l  ]    =  a^.    and 

X 

P[X   =0|X         =1]    =    l-a„.       Let    L      =    (-1)    "e^ ,    where    {E    }     is    an    i.i.d. 
n      '    n- 1  2  n  n  n 

Exponential  sequence.  If  a  =a^  =  a,  { L^ }  has  a  Laplace  marginal 
distribution.  However,  the  correlation  structure  is  not  that  of  an 
AR(1)    process.       It    is,    in   fact,    easy   to   see   that    Corr ( L^ , L^_^)    = 
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(1 /2) (2a-1) '     ' ,   for   k=±1,±2,...,   which  is  not   a  pure  geometric   function 
of   k. 

Two  processes  which  produce  an  AR(1)  correlation  structure  and  a 
Laplace  marginal  distribution  are  the  Laplace  Discrete  AR(1),  LDAR(1), 
which  is  an  adaption  of  the  DAR( 1 )  process  of  Jacobs  and  Lewis  [Ref .  2], 
and  the  linear  process  of  Gastwirth  and  Wolff  [Ref.  13],  called  the 
LAR(1)  process.  The  LDAR(  1 )  model  produces  an  {X  }  sequence  using  the 
first-order   autoregressive  equation  with  random  coefficients 


X      =   V  X      ,    +    (1-V    )L    ,  (II. A.I) 

n         n  n-1  n     n' 


where   {V   }    is  an   i.i.d.    sequence    of    Bernoulli    random    variables    with 

p{V   =1}    =    1-P{V   =0}    =    p;    {L    }    is  an  i.i.d.   sequence  of   Laplace  random 
n  n  ^  n 

variables.      The   coefficient    and    innovation    processes    from    time   n    are 

assumed  to  be   independent   of   X  _,  ,X  _„,....      This  sequence   produces   runs 

of   constant   value  when  successive  realizations  for  V     produces   the   value 

n  ^ 

1.  When  V  is  zero,  a  new  value  is  selected.  Although  LDAR(1)  is  of 
limited  value  in  general  application  because  of  this  runs  property,  it 
is  significant  in  that  it  is  one  of  the  first  in  a  series  of  more 
general  discrete  random  coefficient  equation  models  for  non-Normal  time 
series,  and  it  produces  a  first-order  autoregressive  Markovian  process 
for   any  specified  marginal   distribution. 

The  LAR(1)  model  turns  out  to  be  a  special  case  of  the  more  general 
process  called  the  New  Laplace  Autoregressive  Moving  Average  model, 
NLARMA(p,q).       Properties    of    the   LAR(  1 )    process   are   pointed  out    in  the 


26 


next  section  of  this  chapter,  which  gives  a  characterization  of  the 
Laplace   distribution. 

The  NLARMA(p,q)  model  is  a  very  useful  family  of  time  series  models 
that  are  discrete  random  coefficient  linear  difference  equations.  The 
models  are  extensions  of  the  NEARMA(p,q)  structure  of  Lawrance  and  Lewis 
[Refs.  4,5,6]  to  those  cases  where  the  underlying  marginal  distribution 
is  Laplace  rather  than  Exponential.  The  family  provides  great 
flexibility  to  systems  modelling,  because  of  the  broad  range  of 
correlations   and  different   dependency  structures   which  are  obtainable. 

Section  C  is  an  examination  of  the  second-order  aut ©regressive  model 
of  the  family,  NLAR(2),  for  p  =  2  and  q  =  0  in  NLAR^4A(p  ,q) .  Conditions 
for  the  existence  and  uniqueness  of  the  strictly  stationary  NLAR(2) 
model  are  derived  using  results  from  Nicholls  and  Quinn  [Ref .  16]  about 
Random  Coefficient  Aut or egr essi ve  models  of  order  k,  RCA(k).  In  a 
proof,  very  similar  to  that  given  by  Lawrance  and  Lewis  for  the  NEAR(2) 
model  [Ref.  6],  the  innovation  for  the  NLAR(2)  model  is  derived 
explicitly.  The  innovation  is  shown  to  be  a  convex  combination  of 
scaled  Laplace  variables.  The  correlation  structure  in  the  NLAR(2) 
model  is  shown  to  satisfy  the  Yule-Walker  type  equations  just  as  do  the 
linear  AR(2)  models.  Aspects  of  directionality  and  time  reversibility 
are  also  addressed. 

In  Section  D,  the  first-order  aut or egr essi ve  model,  NLAR(1),  is 
described.  It  is  a  two-parameter ,  first-order  Markov  process  which  is  a 
special  case  of  the  NLAR(2)  model.  The  distribution  of  differences  is 
derived.      The   conditional   density  of   X      given   X  and   the    likelihood 
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function  are  also  derived.  The  non-differentiability  of  the  likelihood 
function  for  all  values  of  the  two  parameters  has  prevented  the 
development  of  the  maximum  likelihood  estimators.  Parameter  estimation 
is  discussed  within  the  context  of  moment  estimators  and  least  squares, 
using  the  usual   linearized  residual. 

In  Section  E,  several  different  special  cases  of  NLARMA(p,q)  are 
formulated  and  briefly  discussed.  The  correlation  structure  for  a 
first-order  moving  average  model,  NLMA(l),  and  a  mixed  aut  or  egr  essi  ve 
moving  average  model,  NLARMA(1,1)  are  given.  Correlation  structure  is 
derived  and  parameter  estimation  is  discussed  for  the  general  p  -order 
aut  or  egr  ess  ive  models,    TLAR(p),   which  are  special    cases   of   the  NLAR(p). 

Each  of  these  models  in  Section  E  could  well  be  the  basis  for 
further  research.  The  intent  at  this  point  is  primarily  to  further 
substantiate  the  claim  of  wide  versatility  and  tractability  in  modelling 
non-Normal   time  series   within  the   context   of   the  NLARMA(p,q)   family. 

For   example,   the   bivariate  AR( 1 )    process   with  Exponential    marginal 
distributions   of  Dewald  and  Lewis   [Ref .    24],    can   be  extended  to  the   case 
where  the   marginal    distribution   is   Laplace.      This,    however,    is    not 
discussed  further   in  this   thesis. 
B.      CHARACTERIZATION   OF   THE    LAPLACE    DISTRIBUTION 

1 .      Properties   of   the  Laplace  Distribution 

The  Laplace  distribution  is  also  known  as  the  double  Exponential 
distribution.  In  general,  the  density  of  a  Laplace  distributed 
variable,    L,   has   two  parameters — a  location  parameter   -»   <   p   <  +<»,    and  a 


28 


scale    parameter    X    >    0.      The    parameter    y    is   fixed   here  at   zero.      For 
-oo  <  X  <   "  we  have 


fj^(x;A)    =  —  exp(-    lx|/A).  (II. B. 1.1) 


In  what  follows  we  will  define  {L  }  as  a  sequence  of  i.i.d. 

n 

random  variables  of  the  Laplace  distribution  with  A  =  1  (Standard 
Laplace).  The  characteristic  function  of  the  standard  Laplace  variable 
is 


<J),(u))    =  -T—: — 2    .        -»   <  w  <   »,  (II. B. 1.2) 

Li  1+0) 


and  we  have 


n. 


[O        if      n     is  odd, 
E(L")    =    j  (II. B. 1.3) 

[n!      if     n     is  even, 


so  that  E(L)  =  0,  Var(L)  =  2,  skewness  is  zero,  and  kurtosis  is  3.  The 
value  of  the  kurtosis  indicates  that  the  symmetric  Laplace  distribution 
has  heavier  tails  than  the  normal  distribution,  for  which  the  kurtosis 
is  0. 

The  sum  of  n  ^  2  i.i.d.  standard  Laplace  variables  can  be 
written  as  the  difference  of  two  i.i.d.  random  variables  Y  ,  Y^  with 
Gamma    distribution,    shape    parameter    k    =   n   and  scale  parameter    X   =   1. 
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This   follows    immediately   from    the    characteristic    function.       Let 

n 

Y   =      y      L. ;    then 
i  =  1      ' 


^Y^-)  =  i^b^r  -'  [i-h^r  [r^r  -  *y/-)  v^*  (ii.b.i.4) 


This  result  is  quickly  generalized.  Replacing  n  by  t  >  0,  we  see  that 
[<j).  (w)]      is   the   characteristic   function   for    the    variable  X    =    Y      ~    Y 

Li  I  ^ 

where  Y.  -  Gamma  (t,1),  i  =  1,2  and  Y.  and  Y_  are  independent.  This 
demonstrates   that  the  Laplace  distribution  is  infinitely  divisible. 

Another  useful  result  is  obtained  from  (II. B.I. 4)  when  n  =  1. 
It  shows  that  a  Laplace  variable  is  the  difference  of  two  i.i.d. 
exponential  (X  =  1)  variables.  This  makes  it  quite  simple  to  generate 
Laplace   distributed  variates   in  computer  simulations. 

Random  variables  with  a  standard  Laplace  distribution  are  self- 
decomposable.      Let 


(j)    (u))    =   (j).  (o) )/({),  (pw),  0   ^    p    <    1.  (II. B.I. 5; 

S  Li  Li 


According  to  Feller  [Ref .  25:  p.  588],  if  (p  (w)  is  the  transform  of  a 
random  variable  for  each  0  ^  p  <  1,  then  L  is  said  to  be  self- 
decomposable.      But   for  -"   <  w  <   " 


)^(a3)    =    {1    +    (pco)^}(1    +   o)^)"""     , 

=    {p    +    (1    -    p)(1    -    ia))"^}{p    +    (1    -    p)(1    +    iu))~^}  (II. B.I. 6) 
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p'    +    (1    -    p")(l    +  ui^)'\  (II. B.I. 7) 


We  recognize  (II.B.1.6)  as  the  product  of  the  characteristic  functions 
of  two  i.i.d.  innovation  variables,  e  and  -Cp,  as  described  in  the 
EAR(1)   process   in   [Ref.    1],      Also  from    (II. B.I. 7) 


,p.  p^ 


jo  w, 

e   =    I  (II. B.I. 8) 

L  w.p.  1    -    p^ . 


2.      The  Laplace  First-Order  Autoregressi ve   Process,    LAR(1) 

The  i.i.d.  sequence  (e  }  with  distribution  given  in  (II. B.I. 8) 
is  the  innovation  process  of  a  first-order  linear  autoregressi ve 
equation 


X      =    pX      ,    +    e    ,  (II. B. 2.1) 

n  n-1  n 


where    {X   }    is  a  stationary  time  series   with  double  exponential    marginal 
n 

distribution,  |p|<1.  This  is  the  LAR(l)  model.  It  is  actually  a 
rediscovery  in  light  of  the  fact  that  Gastwirth  and  Wolff  [Ref.  13]  had 
derived  it  earlier;  also,  Gastwirth  and  Rubin  [Ref.  14]  discuss  it 
within  the  context  of  robust  estimation  techniques.  The  present  account 
of   LAR(1)    includes   new   results. 

The  LAR(1)  model  has  the  same  properties  as  the  EAR(  1  )  model  in 
[Ref.  1]  with  two  important  differences.  First,  if  -1  <  p  <  0,  negative 
serial    correlations    for    odd    lags    are    obtained.       Secondly,     it    is 


31 


partially  time  reversible  in  the  sense  that  for  all  i   and  n,  both  of  the 
following  are  true: 


^(^nV.)  =  ^^Vn..)  =  °'  (II.B.2.2) 


PCX   >  X   J  =  P(X   <  X  ,)    =   1/2.  (II.B.2.3) 

n    n-1       n    n-1 


These   results    are   derived   in  Section   II. C      and  Section  II. D.      Note, 

however,   that  since  LAR(1)    is    a   linear    AR(1)    model    with  non-Gaussian 

innovation    {s    },    it    is    not    fully   time   reversible    (Weiss   [Ref.    26]). 

Also,   note  that   this  LAR(  1 )    model    has    the    zero   defect    property;    when 

e      =0,    then  X   /X      ,    =   p  and   p  can   be   determined  exactly  in  long  enough 

runs   of   the  series    {X   }.      This  property    is    generally   undesirable,    but 

the    broader  NLAR(2)   model    developed  in  the  next  section  is  free  of   this 

defect,   except   for   the  special    parameter   values   for  which   it   reduces    to 

the  LAR( 1)   model. 

If  no  repeats   are  observed  in  a  realization  of   the   time   series, 

an    extremely   efficient    estimator   of    p   for   LAR(  1 )    is  the  median  of   the 

ratio   X./X.     ,.      The    simulation    results    given    in    Table    II. B. 2.1 
11-1 

substantiate  this  claim.  In  Section  II. D. 4  and  again  in  III.E.5,  using 
the  framework  of  the  Simulation  Testbed  (SIMTBED)  [Ref.  15],  we  will  see 
that  this  median  ratio  is  for  small  samples  very  biased,  and  is, 
apparently,  asymptotically  biased  in  all  of  the  random  coefficient  AR( 1 ) 
models  with  a  Laplace  marginal   distribution  that   we  examine. 
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TABLE    II. B. 2.1 

Simulation  Results   using  Median    {X./X.    J   to  Estimate 

1   1-1 

p  in  the  LAR(l)  Process  for  Samples  of  Size  2000 


True  p 
-.9 

-.2 

-.1 

+  .01 


p  =  med  {X./X.  , } 
11-1 


-.9 


-.2 


-.08746 


+  .01986 


Comments 


,9  occurred  1586  times 
in  1999  ratios 

.2  occurred  75  times 
in  1999  ratios 

,  1  occurred  11  times 
in  1999  ratios 

01  never  occurred  in 
1999  ratios 


+  .5 


+  .75 


+  .5 


+  .75 


+  .5  occurred   490  times 
in   1999  ratios 

+  .75  occurred   1 1  49 

times   in    1999  ratios 


C.       A  SECOND  ORDER   A  UT  ORE  CRESS  IV  E   LAPLACE    TIME   SERIES    MODEL,    NLAR(  2) 
1 .      Introduction 

Using  the  terminology  from  [Ref.  6]  the  following  time  series 
model  called  NLAR(2),  New  Laplace  Second-order  Autoregressi ve  model  is 
proposed.  This  is  a  special  case  of  NLARMA(p,q)  model  with  p  =  2, 
q  =  0.  The  NLAR(2)  model  has  four  parameters,  double  exponential 
marginal  distribution  for  {X  },  second-order  autor egr essi ve  Markov 
dependence,   and  autocorrelations  satisfying  Yule-Walker  type  equations. 
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The  stationary  NLAR(2)  model  has  the  same  form  as  the  stationary 
NEAR(2)  model  in  [Ref .  6].  Writing  the  time  series  {X  }  in  the  form  of 
an  additive,  linear,  random  coefficient  autoregr essi ve  difference 
equation,  we  have  for   all   n  that 


^n-   ^1   ^;   Vl    ^   hK\-2   '   ^n' 


(II. C. 1.1) 


where    (K' ,    K"}    is    a   sequence   of    i.i.d.    discrete    bivariate    random 
n        n 

variables   with  distribution 


{K' ,   K"}    = 
n       n 


(1,0) 
(0,1) 
(0,0) 


w.p  .    a 
w.p .    a 


1' 
2' 


n=0,    ±1,    ±2,     ...    ; 


w.p.    1-   a    -Op, 


(II. C. 1.2) 


{e    }    is    an    i.i.d.    innovation   sequence   whose  distribution  is  given  in 

(II.C.2.4);    and    {e    }    and    {K',     K"}    are    mutually    independent    and 

independent    of    X       ,  ,X      ^ The    parameter    space    is    defined    by 

^  n- 1      n-2  '  ^  *' 

0    <    Ib.I     ^    1    and    0    $    a.    ^    1,    i    =    1,2;    a,    +   a^   ^    1.      Graphs    of    the 
'    1  '  1  12 

admissible  regions   in  the   parameter  space   and  the   correlation  space   are 

presented  in  Section  II. C. 3. 

Equations    (II. C. 1.1)    and    (II. C.I. 2)    have    a   direct    physical 

interpretation.      The   observed   process    at    time   n,    X    ,    is    only   one   of 

three  possibilities:      i)   X      is  some  multiple  of   what   it  was   at   time  n-1 , 

n 

6,X      .,   plus  some  random   noise    e    ;     ii)    X      is    some   multiple    (possibly 
1   n-1  n  n 

different    than    B-),    of    its    value    at    time   n-2,    BpX    _    ,    plus    some 
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independent   random  noise;    iii)   X      is  just  random  noise,    e    ,    independent 

n  n  ^ 

of   everything  up  to  time  n. 

2.      Existence   and  Uniqueness 

The  work  of  Nicholls  and  Quinn  [Ref.  16]  on  random  coefficient 
autoregressive  models  is  relevant  to  the  NLAR(2)  process.  They  have 
given  the  necessary  and  sufficient  conditions  for  the  existence  of  the 
unique  covariance  stationary  solution  to  the  following  class  of 
univariate  random  coefficient   autoregressive  models  of   order   p,    RCA(p): 


^n  '  j,    f^i   *^(l>lVl   *  ^n'  (n.C.2.1) 


n=0,    ±1,    ±2,    ...,   where 

a.  the   Y.'s  are  real   constants: 

1 

b.  (B    }    is    a    p-vector,    second-order  stationary,    independent   process 

with  E(B   )    =  0  and  constant   covariance  matrix; 
~n 

c.  {e    }    is    a   scalar,    second-order   stationary,    independent   process, 

independent   of    {B   },    with  E(e^)    =   o^   for   all   n. 

-n  n 

They  also  have  shown  that    if    {B   }    and    {e    }   are  i.i.d.   processes, 

then  the  solution    {Z   }    is  strictly  stationary  and  ergodic. 

Let    Y.    =    a. 6.    for    i    =    1,2  and   B   (1)    =   6,(K'    -  a,)    and   B   (2)    = 
111  n  1      n  1  n 

Bo(K"    -  a^) .      Then    (II. C. 1.1)   and   (II. C. 2.1)    have    the    same    form.      That 
2      n  2 

is  (II. C. 1.1)  is  an  RCA(2)  model  if  the  innovation  of  NLAR(2)  satisfies 
condition  (iii)  above.  Thus  applying  the  results  in  [Ref.  16:  p. 31  and 
p. 37],    there  exists  a  unique  strictly  stationary  and  ergodic  solution  to 

(II. C. 2.1)    for   Y.    and  B   (i)   as   defined  above,    if   and  only   if   all    of    the 

1  n 
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roots  of   the   characteristic  equation 


(t^   -  a^e^t   -   a2e|)(t^   -  a^Bp    =  0,  (II.C.2.2) 


are  within  the   unit  circle,    i.e.    iff   a  6^    +  '^p^p  '^    ^*      This  is  satisfied 

for  the   conditions   on  the   parameters   defining  NLAR(2),   thus   establishing 

the  existence  of   the  model    (II.C.1.1). 

No  marginal   distribution  is  ascribed  to  solutions   of   the   general 

RCA(p)    models    in    [Ref  .     16].        It    is,     in    fact,    determined    by    the 

independent    choices    of    the    innovation   and   the   random    coefficients. 

However,    by    specifying    the    marginal    distribution   and   the   random 

coefficients,   in  NLAR(2)    the   innovation  is  restricted  more   than    in   the 

RCA(p)    model.       If    the    X      in    (II.C.1.1)    or    Z      in    (II. C. 2.1)    have    a 

standard  Laplace  marginal    distribution,   then   all   their  moments   are  given 

by    (II. B.I. 3).      From    (II.C.1.1)    or    (II. C. 2.1),    it  follows   that   for   all 

P   =    1,2,... 

E(ef'')    =    {(2k)!}    [l    -    (a.B.^''    +  a,B^^^) 
n  '■  11  2   2 


"r^     ,,       „2(k-i)^  2(k-i)_,    2(k-i).  ..,_.  .,,,1    JII-C.2.2) 

L      Kct^B^  +  a2B2  ^^^^n  )/{(2i)!l|J    >   0, 

i  =  1 

and  for   this  to  be   true   it  is  necessary  that 


2k  ?k 

a^B^      +  a2B2     <    1-  (II.C.2.3) 
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Since    a,    and   a^  are    probabilities    it    is   necessary   that    |6.  |    ^   1    f 


or 


i   =   1,2  for    (II.C.2.3)    to  hold.      If  not,   there  exists    for    every    a      >   0 

and  ttp   >  0  an  integer  m ,   such  that   a  6^      or   a  6-     is  greater  than   1. 

We   have   now    established    the    necessary    conditions    on    the 

innovation    {e    },    and  on    B      and    Q   — namely  that    |6.|    ^    1,    i    -   1,2 — for 

the  existence  of   a  unique  strictly   stationary   solution   to    (II. C. 2.1) 

with   a   marginal    Laplace    distribution  and  with  the  random   coefficients 

given   by    (II. C.I. 2).      In  the   next    theorem,    we   show   that    |6.|     ^    1    for 

i    =    1,2    is    also   a   sufficient    condition    and   that    such   an    innovation 

random   variable  e      exists.      We    also   give    its    explicit    form--a    convex 
n  o  1- 

combination  of  Laplace  "random  variables.  For  simplicity,  the  parameter 
space   is  regarded  as   being  described  by  strict   inequalities   for 

THEOREM    1.       Let    {X    }    be    a   stationary    process    with  standard   Laplace 

n 

marginal  distribution.  For  all  n,  let  equations  (II. C. 1.1)  and 
(II. C.I. 2)  hold  with  0  <  |  S .  1  <  1  ,  0  <  a  ^^  <  1  for  i  =  1,2  and 
a.    +  Up   <    1 .      Then 


e      =    K     L      = 
n  n     n 


L       w.p 
n 


Ib^lL^      w.p 


l-p^-p^. 


P2' 


(II.C.2.U) 


l^sl^n     ""-P-  ^3' 


where    (L   }    are  i.i.d.   standard  Laplace   variates;    the  K    's  have   values   in 
n  n 

{1,     |b    |,    |b    |}    and   are    independent    of    {  L^ }    and    {K^,    KJ^}    for   all   n. 
They  are  also  independent   of   ^n-1'^n-2' Furthermore, 


37 


p^  =   {(a^B^'    ^  0L^&l)bl  -    (a^    +  a^)  6^    ^|^/(^2  "   ^3^^^    '   ^2^'        (II-C.2.5) 


p^   =   {(a^    +  a^)    B^    6|  -    (a^6^    +  a^Bpb^l/Cbl  -   b^)(1    -   b^)  ,      (II.C.2.6) 


1    >  b|   =  ^{s  +    (s^   -  4r)^^^}    >  b^  =  i{3  -    (s^  -  4r)^^^}    >  0,    (II.C.2.7) 


s   =    (1    -  a^)B^    +   (1    -  a^)^^^,   and  (II.C.2.8) 


r   =    (1    -  a,    -  a.)BfB^.  (II.C.2.9) 

1  ^      \    ^ 


Proof; 

For   the  NLAR(2)   model   specified    by    (II. C. 1.1),    (II.C.1.2)    and 

(II.C.2.4)    -    (II.C.2.9),    let    '^^im)    and    ()>    ( uj )    be    the    characteristic 

functions   of   the    {X   }   and   {e    }   sequences.      If   {X   }    is  stationary,   then 

n  n  n 


()^(w)    =   (})    ((jj){a^  4)^(6^0))    +  a^<i>^{B^ui)    +    (l-a^-a^)).  (II. C. 2. 10) 


Assuming  a  standard  Laplace  marginal   distribution  for    {X    },    the 

n 

independent   distribution  of    {e    }   has   a  characteristic  function,   possibly 
improper,   given   by 


(UB^oj")  (1+6^0)') 
(oo)    =  ^  ^ 


c      '         (1+u)^)[(1-a^-a2)6^'B|w'*    +   {(l-a^)B^'    +    (1-a2)B^}a)"    +   1]* 


(II. C. 2.11) 


It  is  convenient  to  factor  the  quadratic  in  w^  in  the 
denominator  of  (II. C. 2. 11).  The  roots  of  this  factor  are  both  real  and 
distinct.      To  see  this,   note  that 


{(l-a^)e^    +    (l-a^)^^}'   -   Ml-a^-a2)6^B| 


=   {(1-a^)6^   -    (1-02)6^^    +   ^a^a^B^'B^   >  0 


The   roots    are   also   both  negative,   which   can  be  seen  by  noting  that   the 
product    I", 1^2   ^    1 /( 1 -a.  ttp)  B?  6p    is    positive,    but    the    sum    r       +    r 
=   -{(1-a^)6^    +   (1-a2)62}/(1-a^-a2)6^62   is   negative. 

Letting  r     =    -1 /b^  and  r     =    -1/b^,    we    can   rewrite    (II. C. 2. 11) 
using  partial   fraction  decomposition  to  obtain 


♦  ^(.)    -    (l-p^-p^)!^)    .   P^{^]    *   Pjlf^^).  (II.C.2.12) 


From    (II. C. 2. 11) 


and 


b|   +  b^   =   (l-a^)B^    +    (1-a2)B^   =   s  (II. C. 2. 13) 


b^b|    =    (1-a^-a2)6j6^   =   r.  (II.C.2.1M) 


Comparing   (II.C.2.12)   and   (II. C. 2. 11)    term  for   term  also  yields 


p^d-bp    ^  P3(1-b^)    =  a^S^^    +  a^B^  (II. C. 2. 15) 


and 
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p^d-bpb^  +  p^(1-b^)b|   =   (a^+a2)B^6|.  (II. C. 2. 16) 


Expressions  for  b^,  b^,  p  and  p^  are  obtained  in  terms  of  a,, 
a„,  Q.  and  &  ,  by  solving  (II. C. 2. 13)  -  (II. C. 2. 16).  From  solving 
(II. C. 2. 15)  and  (II. C. 2. 16)  simultaneously,  we  obtain  (II.C.2.5)  and 
(II.C.2.6).  Equations  for  b^  and  b^  given  in  (II.C.2.7)  are  obtained 
from  solving  (II. C. 2. 13)  and  (II. C. 2. 14)  simultaneously.  Arbitrarily, 
let   bp  be  the   larger   value. 

It  remains  now  to  show  that  the  inversion  of  (II. C. 2. 12)  will, 
in  fact,  yield  a  function  that  is  a  probability  density  and  is  the 
mixture  of  densities  for  scaled  Laplace  variables.  To  do  this,  we  show 
that   Pp  and  p     can  be   interpreted  as   probabilities   and  that   p     +  p     <    1. 

To  establish  that  p  +  p  <  1,  we  have,  after  adding  (II.C.2.5) 
and    (II.C.2.6) 


(a  6^+a   Bp    -    (a  +a JS.^Bp 
P^   ^  P3   =  (i-ba)(i-pa) .  (II.C.2.17) 


Multiplying  out    (1-b|)(1-b^)   and  using   (II. C. 2. 13)    and    (II. C. 2. 14),    we 
have,   after  some  rearrangement 


(1-Bp(1-Bp 
^2   "  P3   ^   ^    "    (l-B^d-ep    +  a^B^^(l-Bp    +  a^B^d-B^    '  (II-C.2.1< 


This  expression   is    clearly    positive    and   less    than   one,    from 
which  it  follows   that   p   +p   <1. 
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To  show  that  p^  and  p  are  probabilities,  it  remains  to  show 
that  they  are  both  positive.  To  do  this,  it  is  shown  that  the 
numerators  and  the  denominators  of  (II.C.2.5)  and  (II.C.2.6)  are 
positive.  For  the  denominators,  this  requires  that  0<bj  ,  b^  <1,  which 
is  shown  by  noting  0<(1-b|)  (1-b^Xl .  From  (II. C. 2. 17)  and  (II. C. 2.  18), 
it  follows   that 


(l-b^d-b^)    =    (1-6p(1-6^)    +  a^6^^(1-6p    *   a^B^d-Bp    >  0, 


Also, 


1    -    (1-b^)(1-b^)    =   (b|   +  bp    -   b|b^ 


=    (l-a^B^    +   (1-ct2)B^  -    (1-a^-a2)6^'6| 


=    (l-a^B^d-B^    +    (1-a2)B|(l-B^)    +    ^]^\    >   0 


Therefore,    b^   and    b|   are   less    than   one,    so    p-    and    p^    have 


positive   denominators. 


To  see  that   p     and  p      have    positive    numerators,    note   that    it 


must  be  true  that 


(a   +a   )B'Bp 
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Using   (II.C.2.8)   and   (II.C.2.9),    (II. C. 2. 19)    is  equivalent  to 


1/2    .    _         .    .    /_2    ..-nI/S 


or 


or 


-(s^-Mr)  <   2b  -   s  <    (s^-Mr)        , 


s^  -  4r   >   (s-2b)S 


sb  -  b^  -   r   >  0.  (II. C. 2. 20) 


But   the   lefthand  side  of    (II. C. 2. 20)    is 


a^a2B^'B|(S^'-6|)' 
(a^B^+a^Bp^ 


which   is  strictly  positive. 

Therefore,   p     and  p     are  both  positive   and  p   +p   <  1  .      Therefore, 

p^,   p^  and   1-p^-p_  can   be  regarded  as   probabilities.      Therefore  e      has   a 
23  2      3  i^ 

proper   density  which   can   be   generated  as    the  mixture   of    three   Laplaces 

with  scale  parameters    1,    |b    |    and    |b    |,   respectively.  Q.E.D. 

The   general  NLAR(2)   model    uses    the  four   parameters   to   achieve    a 

wide    range   of    sample    path   behavior.      Figure  II.C.2.1    illustrates   four 

different    realizations    of    the    NLAR(2)    process.       In   each    case,    the 

theoretical    autocorrelations    are    identical    with    p(1)    =   0.64    and 

p(2)    =  0.5.      Also,   note  that    each   sample    path   was    generated   from    the 

same  i.i.d.   standard  Laplace  sequence    {L   },    such  that    (X    ,X    )    =   (L   ,L   ). 

Since  this  is  not   the  steady  state  bivariate  distribution  of    (X    ,X      ,), 

^  n  '    n-1 

the    sample   paths   illustrated  in  Figure  II.C.2.1   are  displayed  beginning 
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with  X^-,    to  avoid  the   initial   transient   behavior   of    the    process.      The 
501 

true  value  of  each  parameter  is  displayed  above  the   corresponding  sample 

path.      Figure  II.C.2.2  contains    the    scatter    plots    for    each   sample    in 

Figure  II.C.2.1.      The  sample  size   in  each  plot   is  2000. 

Many  special   cases   of   the  NLAR(2)   model   could  be  mentioned.      The 

following  have  one  or  more  of   the  parameters  at  their  boundary  value  and 

have  valid  but   less   complicated  results  for  the  distribution  of    {e    }    in 

(II.C.2.4).       If    a,    =   ap  =  0,    then    {e    }    is  the   i.i.d.   sequence    [L   }    and 

X     =   e    .      If  ct,    =    1    then    {e    }    is    the    innovation   of    the   LAR(  1  )    model 
n  n  1  n 

derived    from     (II. B.I. 7)    and    (II. B.I. 8).       If    \&    \     =    |  B^l     =    1    and 
a-    +  a„    <    1    then    each    e      is    distributed   as    a   scaled   Laplace    random 


variable,    /   l-a^-a^    L    .       These  models    are  called  the  TLAR(2)   models, 

which   are   easily   extendable   to   higher-order    aut or egr ess i ons  ,    as 

discussed  in  Section  II. E.      If  a     <    1    and  ap  =  0  or    6p  =  0,    then    {e    }    is 

the   innovation  of    the   new   first-order    aut or egr essi ve   model    NLAR(1). 

This  model    is  the  subject   of  Section  II. D. 

3.      Autocorrelation  Structure 

In    this    section,    it    is    shown    that    the    autocorrelations 

p(8,)    =   Corr(X    ,   X   _    ),    i   =  0,±1,±2,...   of    the  NLAR(2)    model    satisfy    the 

Yule-Walker   type   difference  equations;    thus   the  second  moment   dependency 

aspects   are  indistinguishable  in  form  from   those  for  the  AR(2)    process. 

We   also    compare   the    admissible  regions   of   an  AR(2)   with   (i)   an  NLAR(2) 

with   4  parameters   and   (ii)   an  NLAR(2)    with  only  two  parameters. 

From    the    independence    of    {K    }    and    {K',    K"},    and    (II. C. 1.1), 

n  n        n 

(II. C. 1.2)    and    (II.C.2.4),    we   see   that    E(K')    =   a    ,    E(K")    =    a^    and 
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E(e    )    =  E(K    )E(L  )    =  0.      Multiplying   (II. C. 1.1)    on  both  sides   by  X      „    we 
n  n         n  n-x, 

have   for    I   >    1.    E(X^X„.^)    -  c,  6,  E(X^.,X^.^)  *  c^B^ECX^.^^-l '  • 

Dividing   by    Var(X    )    we  have    p(-S,)    =  a,B,p(2,   -   1)    +  a^B^pCJ,   -  2),   since 
p{-i)    =   p(8-).      Substituting  a.  6.    =  a.    for   i   =   1,2  and   p(0)    =  1  ,   we  have 


p(1)    =  a^    +   a^pd) 

p(2)    =  a^pd)    +  a^,  (II. C. 3.1) 


which  are  the  same  equations   as   those  which  occur  for  the  AR(2)   process. 

Since  |8.|  i  1  for  i  =  1,2  and  a  +  a^  ^  1  in  NLAR(2),  the  usual 
triangular  admissible  region  for  AR(2)  given  in  Box  and  Jenkins 
[Ref.  23:  p.  61]  shrinks  to  the  interior  of  a  diamond-shaped  area  in 
(a.  =a-B,,  a=apBp)  coordi nates:  |a|  +  la^j  ^  1.  (See  Figures 
II. C. 3. la  and  lb).  In  (p(1),  p(2))  coordinates  the  equation 
p(1)^  =  (1  +  p(2))/2  defining  allowable  combinations  of  p(1)  and  p(2)  in 
AR(2)  also  changes.  For  NLAR(2),  the  space  in  (p(1),  p(2))  coordinates 
becomes  a  triangular  region  bounded  below  by  |p(1)|  =  p^H  +  p(2)}.  (See 
Figures   1 1.  C.  3.  2a  and  2b). 

The  reduction  in  allowable  parameter  or  correlation  combinations 
for  NLAR(2)  over  the  AR(2)  model  is  not  large.  This  encouraged  us  to 
consider   a  2-parameter  NLAR(2)   model    by  specifying  a.    =   B? ,   for   i    =    1,2, 

so  that   a.    =   B?.      The   parameter  space   in    (a^  ,a^)   coordinates   becomes   the 

11  ^  12 

3/2 
symmetric  region  bounded  by  the   curves    Bp    =    ±    (  1    -    Bf)  (see   Figure 

II. C. 3.  1c).       In    (B^,    Bp)    coordinates   the   admissible  region  of   the  two 

parameter  model    is  bounded  by  the   unit  circle   B?      +   Bp  =   1  •      Using   only 
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two  parameters  leads  to  the  admissible  region  in  Figure  II. C. 3. 2c  for 
(p(1),  p(2))  space.  The  (p(l),  p(2))  space  was  obtained  by  transforming 
the  lines  62  =  a^  =  c  ,  -1  ^  c  ^  1  ,  in  Figure  II. C. 3. 1c  to  p(2)  - 
(1-a2)p(1)'+a2.  where  |p(1)|  <  a^/Cl-a^)  =  6^V(1-6p  and  6^  =  (1-8p^^^ 
if  a^  ^  0  and  6|  =   -(l-Bp^'^^   if  a^  <   0. 

All  the  plots  in  Fig.  II. C. 3.1  were  generated  from  a  grid  of 
equally  spaced  values  of  a  and  a  .  In  Fig.  II. C. 3. la  the  points 
satisfy  the  Yule-Walker  equations  (5.1).  In  Figs.  II. C. 3- lb  and  Ic,  the 
points  also  satisfy  the  conditions  of  Theorem  1.  In  Fig.  II. C. 3-2  the 
feasible  combinations  of  p(1)  and  p(2)  are  plotted  for  those  values  of 
a  and  a  from  Fig.  II. C. 3.1  using  the  Yule-Walker  equations  (5.1). 
M.      Directional  Moments   and   Partial  Time  Reversibility 

In  the  last  section,  we  demonstrated  that  the  second  moment 
dependency  aspects  of  the  NLAR(2)  model  were  indistinguishable  in  form 
from  those  of  the  ordinary  AR(2)  model.  Also,  it  is  well  known  that  if 
the  linear  autoregressi ve  model  is  not  Gaussian,  then  the  process  is  not 
completely  determined  by  the  first  and  second  moments.  Thus  in  model 
identification  it  becomes  necessary  to  examine  third  order  moments  to 
further  identify  the  process.  Special  third  order  moments  E(X^  >^  ^n)» 
for  all  i,  are  known  as  directional  moments.  If  the  directional  moments 
for  all  i  are  equal,  which  is  necessary  for  a  process  to  be  fully  time 
reversible,  we  say  the  process  is  partially  time  reversible  in  the  sense 
of   directional   moments. 

A  process   is  fully  time  reversible   (Lawrance    [Ref .    27])    if    the 

joint   distribution  of  X    ,   X   ^^  ,    ....   X^^.^.    is  the  same  as   that   for  X^^^, 

X  ,    ...,X     for   all   r  and  for   all   n.      Since  LAR(1),   a  special   case  of 

n+r-1  n 
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NLAR(2),    is    not    fully   time   reversible,   NLAR(2)    is  in  general   not   time 

reversible. 

In   this    section   we   show    by    induction   arguments    that   all   the 

third  order  moments   of  NLAR(2)   are  the  same  as   those  for  Gaussian  AR(2) 

model;    i.e.    E(X.X.X    )   =  0  for   i,   j,   k.      This  implies   particularly  that 
1     J     K 

the   directional  moments   of  NLAR(2)    are  equal   and  therefore  that    NLAR(2) 
is  always   partially  time  reversible. 

In  Section  II. B,  we  found  that  E(X?)  =  0  for  all  i  since  X.  is 
marginally  Standard  Laplace.  It  is  easy  to  establish  the  following  two 
equations: 


^^Vn-l)    =   ^2-2^^^nVl^'  (II. C. 4.1) 

ECX^X      .)    =    {(6^a_)/(1    -26.B^a,a_)}    E(X   X^^    J.  (II.C.4.2) 

n  n-1  2  2  12  12  n     n-1 


Solving    (II. C. 4.1)    and    (II.C.4.2),    simultaneously    yields    E(X   X^_    )    = 

E(X^X      J    =   0. 
n  n-1 

Now,   using  separate   induction    arguments    and    the    stationarity 

assumption,    we    establish    that    E(X    X^     „)    =   0    for    all    i    i    ^,    and 

n    n-  i 

E(X^X      , )    =   0   for   all    k   ^    1. 
n  n-k 

The   proof  of  E(X  X^_    )    =  0   is  straightforward. 

To  prove  E(X^X      ,  )    =0,    we  first    show    that    the    expectation   of 
^  n  n-k  '  ^ 

special    third   order    moments    of    the  form  X  X      ,X     ,    for   k  S   2   is   zero. 

n  n-1    n-k 

Define   ja,     =   E(X  X      ,X      ,  )   and   assume    E(X2X       .)    =    0,     1    <    k    -    1.       From 
k  n  n-1    n-k  n   n-j  ^ 


(II. C.I  .1)  , 
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Now  from    (11.0.4.1)    and   (II.C.4.2),   we  have 


M.    =   E(X  X^    ,)    =  a_B^E(X^X      J    =  0.      Therefore   y,     =  0. 
1  n  n-1  2  2       n  n-1  k 


We  now  proceed  to  show  that  E(X.X.X,  )  =  0  for  all  i,  j,  k. 

1    J    k  -^  ' 

Without   loss   of   generality  let   i   <   j    <    k   so   that    k    =    i    +    n,j    =    i    +    m 

and    n    >    m.      Therefore    by    stationarity    E(X.X.X,  )    =    E(X.X.       X.       )    = 

^  ^  ijk  ii+mi+n 

E(X.X.     ,        ^X.       ).      Fixing  m  so  that   0   <    m   <    n   we    use    induction   on    n. 
1    i-(n-m)    i-n  * 

Let   n  =   2,    implying  m  =   1.      The  first  step   in  the   induction  follows   from 

E(X.X._     X._     )     =     M        =    0.        Next     assume     that     for     m     <     n     <     K, 

E(X.X.     ,  ,X.       )    =    0.       Now   we   show  that   E(X.X.     ,^   ,       ,X.     ,,^    ,,)    =   0. 

1    i-(n-m)    i-n  i    i-(K+1-m)    i-(K+1) 

Using    (II. C. 1.1),   we  write 


^^^i^i-(K-Hl-m)^i-(K^1)^    =   °'l^^^^i-1^i-(K+1-m)^i-(KH)^ 

+   E(e.X._^l^^^_^^X._^j^^^^). 


NOW    E(B,X..(^^^_^)X._(j^^^))    =    E(e.)E(X._(j^^^_^)X._(^^^))    =    0    and 

^^h-^h-iK^^-m)h-iK^^)^   -   ^^hh-iK-m)h-K^    -   °    ^y    stationarity   and 

the     assumption.         Likewise     E^^|_2^i-(K  +  1-m)'^i-(K+l)^     ^ 

E(X.X.    ,,„    ,,      ,X.     ,,^    ,s)    =  0.      This  completes   the   induction. 
1    i-{(K-1)-m}    i-(K-l) 

An    immediate   result    from    the    argument    about   third  moments   is 

that   Z      =   X     -   X      ,    for    {X    }    of    the   NLAR(2)    is  not   skewed, 
n  n  n-1  n 
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The  residual  analysis  in  [Ref .  6]  and  [Ref .  22]  using  cross 

correlations  between  linear  autoregressive  residuals  R  =  X  -  a.X   ,  - 

n         n  1    n- 1 

a^X      _,    and    their    squares    R^,    does    not    shed   any   new   light    on   the 

directionality/reversibility  in  the  NLAR(2)   model  or  help  in  identifying 

the    appropriateness    of   the  Laplacian  model.     This  is  because  all   third 

lents    have    zero    expectation.       Thus,    we    see    that    E(R^R       .)    = 

n    n  +  ic 


mom( 


E(R   R^      „)    =  0   for   all    Jl. 
n     n+Jl 

Note  that  the  basis  for  the  residual  analysis  using  the  {R  } 
process  is  that  this  process  is  uncorrelated  but  not  necessarily 
independent.      The  moment   results  show  that   the  R    's  have   zero   skewness. 

In   fact,    it    is    easy   to  show  that  the  distribution  of  R      is  the  same  as 

'  ^  n 

the   distribution  of    -R    .       Thus    the    R    's    are   symmetric   although    they 

will,   of   course,   not   have  Laplacian   distributions. 

In   Chapter    IV    of    this    thesis,    a    residual    analysis    based    on 

certain  fourth-order  moments   is  presented. 

D.       THE    NEW   LAPLACE    FIRST-ORDER   AUTOREGRESSIVE   MODEL,    NLAR(  1  ) 

1  .      Introduction 

The  new  Laplace    first-order    autoregressive   model    is    another 

special    case   of    the    NLARMA(p,q)    model    when   q=0  and  p  =  l  .      This   is,   of 

course,   a  special    case  of   the  NLAR(2)   model,   where    either    a^    and/or    ^ 

are    zero    in    (II. C. 1.1).      Examples   of   the   different   sample  path  behavior 

obtainable  from   the  NLAR( 1 )    Process   are  given  in     Figure  II. D. 1.1.      Note 

that    each    sample   has    the    same    value  of   lag-1    serial    correlation,    i.e. 

p(1)    =  Corr(X    ,X  _.).      In  Figure  II. D. 1.2  are  the   corresponding   scatter 

plots    for    the  samples   in  Figure  II. D. 1.1.      In  the  scatter   plot  labeled, 

"a  =1",   the   distinctive  regression  line,   x      =   px      ,    is    clearly   visible 
1  n        ^   n-1 
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for    the   LAR(1)   process.      This  is  produced  as   explained  in  Section  II. B, 

because  the   innovation,    e       can   be   zero  with  non-zero  probability. 

The    two-parameter    autor egressi ve   model    generates    an    {X    } 

n 

sequence  which  satisfies 


n         n   1   n-1  n' 


(II. D. 1.1) 


where 


K'    = 

n 


and 


1        w.p.   a. 
0        w.p.    1-a. 


n   1    n-1  n  n 


K     = 
n 


w.p.    1-p, 


/   1-a^ I ^1 l^n   ^'^'    ^2 


(II. D.I .2) 


p^   =  a^B^Vd    -    (1-a^)6p. 


(II. D.I. 3) 


Also,    {K'},    {K   },    {L   }    are  i.i.d.   sequences    independent    of    each   other 
n  n  n 

and  independent  of  X     ,,   X     ^ 

n-1        n-2 

From    (II. D.I. 2)    and   (II. D.I. 3),   we  see  that  the   inversion  of   the 

—  1 /2  —  1 

characteristic    function   for    e    ,   letting   X   =    (1-a.)  (|6J)      ,   gives 

for   0<a.<1 


f      (x) 
n 


^^'P2^      -|x|         '^2     -Alxl 


(II. D.I .H) 


which    is    a    convex   mixture   of    Laplace    densities.      This    result    also 
follows    directly   from   Section   II. C. 3,    since    the    NLAR( 1 )   model    is  an 
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NLAR(2)  model.  Likewise,  the  correlation  structure  and  partial  time 
reversibility  in  the  sense  of  directional  moments  are  the  corresponding 
results  for  the  NLAR(2)   model  with  a2=0  or    Q^=0.      That   is 


Corr(X    ,X        )   =   (aB)''^'      for  all   k  =  0,    ±1,    ±2,    ...  (II. D.I. 5) 


and 


n'  n-k' 


ECX^X     ,  )    =   E(X  X^   ,  )    =  0      for   all    k   =  0,    ±1,    ±2.  (II. D.I. 6) 

n  n+k  n  n+k 


We  can  rewrite   (II. D. 1.1)   as 


n        .    j-1 

C     =   e     +      I      B^(    IT  K'    .)e      ..  (II. D.I. 7) 

n         n        .f^      1    .^Q  n-i     n-j 


om 


From    (II. D. 1.1),    it   is  clear   that  X      depends   only  on  X      ,    and    e    .      Fr 
'  n       ^  •^  n-1  n 

(II.  D.I.  7),   we  see  that  X      ,    is  independent   of   e     ,     for   all    kSO.      Hence 

n-1  n+k 

{X   }    is  a  first-order  Markov    process    and   starting   X„    with   a   standard 
n  0 

Laplace   distribution  makes    {X   }   stationary. 

The  remainder  of   this  section  is  devoted  to  specific  results  for 

the    NLAR(1)    process    which   have   not    been   shown    in   the   more   general 

NLAR(2)   model.      The  extension  of   these  results    to   the   NLAR(2)    process 

would   require    the   joint    distribution  of    {X    ,X      ,  ,X     ^},    which  has   not 
^  ^  n     n-1 '   n-2 

been  derived.      The   conditional   density  of   X     given   X      ,    is    derived,    as 

•^     n  ^      n-1 

well  as  an  expression  for  the  joint  distribution  of  the  X  .   The 

n 

distribution  for  the   differences   Z   =X  -X      ,    is  also  derived.      Parameter 

n     n     n-1 

estimation   is    discussed    in   the   context   of  moment   estimators   and  least 
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squares  using  the  linearized  residual.      The    problems    with   finding   the 

maximum  likelihood  estimators   of   a     and   B-    are  also  addressed. 

2.      Conditional   Density  and  the  Joint   Density  of    (X    ,...,X    ) 

To    find    the    conditional    density   of    X    ,    given   X      ,  ,    we    use 

^  n      *  n^l 

(II. D. 1.1)   -    (II. D.I. 4)    to  evaluate  P(X    <x    Ix      ,).      We    have    for    a,<1, 

n  n '  n*^  1  l 

which  eliminates  the  LAR(1)  process, 


P(X  <x  IX  J  =  P(K'6,X  ,  +  £  <x  Ix  J 
n  n'  n-^1       n  1  n-^1    n  n'  n-^1 


=  a,P(e^<x^-BiX^^. )  +  ( 1-a, )P( e^<x^) 
1   n  n   1  n^l        1    n  n 


X  "BiX   ,  X 

n  1  n"!  n 

=  a,  f        f  (x)dx  +  (l-aj  f  f  (x)dx.     (II. D. 2.1) 
11         e  lie 

1         n  ^    n 


Differentiating    (II. D. 2.1)    with   respect    to    x      yields    the    following 
expression  for   a,<1, 


^Y       lY  (^nPn^l)      =     ^  1  f' .      (^n^^B^X^      .)      +      (l'a.)f^      (X^).  (II.D.2.2) 

XX^n'n-^1  1e       n1n-1  1e       n 

n '    n-"  1  n  n 


Examples   of    (II.D.2.2)    for   a   fixed    x      ,    and    fixed    T    =    a,B,    =    .64    are 

n-^  1  11 

given   in  Figure   II. D. 2.1. 

Now  we  can   write    the   joint    density    f^    „         (x    ,    x      , )    as    the 

^  -^       X    X       ,       n        n^  1 

n    n-^  1 

product    f^,    K,         (x    Ix      ,)f^         (x      ,).       In    fact,    the    n-'dimensional 
X      X      ,       n '    n"- 1      X      ,       n-"1 
n  '    n^'l  n-'l 

distribution  of  X    ,...,X,    is  obtained  using  this   product   recursively    to 
n  1 
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obtain  the  density 


^X  ...x/^n ^1^   ^X  IX   /^n'^n-l^  ^X    Ix  ,^^n-ll^n-2 

n    1  n '  n'l  n^l  '  n'-Z 


%|x/^2l^l)  ^x/^1^- 


(II.D.2.4) 


3.  Distribution  of  Differences  and  P(X  ,>X  ) 

n^^      n 

We  now  consider  the  distribution  of  the  difference  Z  =  X  'X 

n    n  n^-l 

Using  (II. D. 1.1)  '  (II. D.I. 4)  and  the  fact  that  e   is  a  convex  mixture 

n 

of  Laplacian  random  variables,   we  used  partial   fraction  decomposition  to 

invert    the    characteristic    function   of    Z      to   obtain    the    following 

n 

expression  for  the  density: 


f„  (y)  =  exp{-|y|/(1-Bi)}^ 


a^(l-B^) 


P2  (l-'Ps) 

{(l-B^)'-o^}  ^  6^(2'B^) 


-  exp(-|y|/o)(op2/2)  j,.  .  ^..g ja   ~j^. 


(l-a^) 


-exp(-|y|)  1— jT^ 


(1-a^)P2    (1-a^)(1-p2)    a^d^p^) 


B^(2-B^) 


+  (1-P2)(1'a^)|y|exp(-|y|)/4, 


(II. D. 3.1) 


where  o^  =  (l--a.)B?. 
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One    immediate  result  is  that  f      (y)    is  symmetric  about   zero  and 

n 

therefore,  P(Z  <0)  =  P(Z  >0)  =  1/2.   This  demonstrates  one  additional 

feature  of  the  partial  time  reversibility  of  the  NLAR(1)  models;  i.e., 

probabilities  of  a  run  down  (X  >X  ,  )  and  a  run  up  (X  <X  ,  )  are  the 
'^  n  n'^  1  n  n^  i 

same.  To  evaluate  probabilities  of  higher  order  runs  would  require  the 
joint  distribution  of  the  sequence  {Z  }.  This  result  has  not  been 
obtained  for  the  NLAR(2)  model. 

4.   Estimation  of  Serial  Correlation 
a.   Introduction 

The  purpose  of  this  section  is  to  present  estimators  of  the 
two  parameters  a,  and  8,  whose  product  is  the  correlation  coefficient  in 
the  NLAR(1)  models.  We  assume  throughout  this  section,  unless  otherwise 
stated,  that  {X  }  has  a  standard  Laplace  (ij  =  0,  X  =  1)  marginal 
distribution.  Estimation  of  \i  and  X  for  models  that  have  marginal 
Laplace  distributions  are  discussed  in  Chapter  III.  We  also  only 
consider  the  random  coefficient  models  of  the  NLAR(1)  process,  i.e.  a<1 , 
thus  eliminating  the  LAR(l)  model.  As  was  shown  in  the  introduction  to 
this  chapter,  for  a  =1 ,  6,  can  be  estimated  very  efficiently,  thus 
eliminating  the  need  for  further  discussion. 

The  method  of  moments  is  used  first  to  find  an  estimator  of 
y  =  aiB-,'  The  joint  moment  estimators  of  a.  and  8,  are  calculated  from 
f ourth^'order  moments.  These  estimators  are  used  later  in  an  iterative 
procedure  to  obtain  the  joint  least  squares  estimators  of  a  and  8.. 

A  least  squares  estimation  procedure  is  defined  for  the 

NLAR(1)  models  using  the  usual  linear  residual  R   =  X  ^a,8,X 

n     n   1  1  n-=-l 
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Minimizing  the  sum  of  R^  leads  to  the  usual  estimator  of  Y  as  given  in 

n 

standard    texts    on    time    series.       In   order    to    estimate   a,    and    6. 

individually,   we  minimize  the  square  of  a  particular  function  of  R     with 

respect   to  a     and   3.  . 

In  the  last  part  of  this  section,   the    problems   of   maximum 

likelihood  estimation  in  the  NLAR(l)   process  are  discussed.      Although  no 

results   are  presented    for    the    general    model,    the   maximum    likelihood 

estimator  of   the   correlation  coefficient   in  the  TLAR(1)   model   is  given. 

b.     Method  of  Moments 

(1)       Estimation   of    Y  by   Second^Order  Moments.      Since  X      is 
1 n 

assumed  to  have  a  standard  Laplace  distribution  with  E(X  )  =  0  and 
Var(X  )  =  2,  an  immediately  obvious  choice  for  estimating 
Y  =   Corr(X    ,X      , )    is   the  following  product  moment: 


1      ^ 
'--^) .  (II.D...1) 


Taking  the  expectation  of   Y  and  using    (II. D. 1.1),   we  have 


^'"^'    ■  JUP^J  Jj    ^"^I'^l-l'    ■    2(^  J2    ^^l"!    ■   ^«1    ■   '■       ("-O-^-^' 


so  that   the  estimator   is  unbiased. 

(2)      Joint    Estimation   of    a. and    $  by  Fourth-Order  Moments. 

The    expectation   of    fourth^order    moments    can    be    calculated    using 

(II.D.1.1)    and  the  fact   that    {X    }    is  a  stationary  process.      For   example 

n 
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E(XJX.^^)    =    12a^B^{1    +    (2-a^)6^}    , 


E(XJX^2^^)    =   4(1+5a^e^^)    , 


ECX.XJ^^)    =   24a^B^     , 


^^^i^i-1^i-2^    =   4a^6^{1+2a^B^^+3a^(2-a^)g:[}. 


(II.D.4.3) 
(11.0.4.4) 

(II. D. 4.5) 
(II.D.4.6) 


Solving  for  a  and  6  in  different  pairs  of  these 
equations  gives  the  estimators  based  on  fourth^order  moments.  It  is  to 
our  advantage  to  use  the  expressions  with  the  lower  order  moments  where 
possible.  Therefore,  using  E(X?X?^  )  and  E(X.X.^  )  instead  of 
E(X.X?^  ),  we  solve  for  the  following  expressions  for  the  joint  moment 
estimators   of   a,    and   B. 


'^I    = 


y  X .X .  , 

i=2       '    '^' 


(n-1) 


I     x^x^    ,    -   4(n-1) 
^i=2      '    '^' 


(II.D.4.7) 


I    (x^xr=^    )    -   4(n-l) 

i=2  

n 
10      I     x.x.    , 
i=2 


(II.D.4.8) 


From  the  scatter  plot  analyses  in  Figures  II. D. 4.1  and 
II.D.4.2,  we  see  an  example  of  the  behavior  of  this  pair  of  estimators 
when  a  =  B^  =  .8  in  the  NLAR(1)  model.   Both  scatter  plots  contain  500 


pairs  (a  ,B.)  derived  first  from  samples  of  size  250  and  then  fr 


om 
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samples  of  size  2500.  It  is  clear  from  the  equations  (II. D. 4.7)  and 
(II.D.4.8)  that  Y  =  ol  Q  .  The  hyperbola  can  be  seen  in  both  scatter 
plots.  Both  parts  are  visible  for  sample  size  of  250.  However,  for 
pairs  derived  from  samples  of  size  2500,  only  the  part  in  the  first 
quadrant  is  visible. 

From  the  Normal  probability  plots  in  Figure  II.D.4.3, 
there  is  little  evidence  of  non-'Normality  for  Y  =  a,  6.  for  N  =  250,  and 
less  for  the  estimator  derived  from  samples  of  size  2500.  However, 
individual  estimators  a.  and  3  look  far  less  Normal  for  both  sets  of 
sample  sizes. 

c.   Least  Squares  Estimation  in  the  NLAR(1)  Process 

(1)  The  Linear  Residual.  The  properties  of  the  linear 
residual  are  developed  for  use  in  deriving  the  least  squares  estimators 
of  Y  =  ct,6,  and  for  a,  and  B  jointly.  We  begin  by  rewriting  (II. D. 1.1) 
in  the  RCA(1)  form  as  given  in  (II. C. 2.1).  We  have 


K   =   ctiBi^^.i  +  6i(K''a.)X^  .  ^    e^.  (II. D. 4.9) 

n    1  1  n^l    1   n   1   n-1    n 


From   this    expression,    there    are    clearly    two   ways    to   write    down    the 

linear  residual,    R   .      The  usual   one  from  linear  theory  is,   of   course 
'     n 


R      =   X  'a,B,X      , .  (II. D. 4. 10) 

n         n     1    1    n-"  1 
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However,  a  particularly  useful  way  of  looking  at  it  is  from 


R^  =  3.(K'-ajX^^  +e^.  (II. D. 4. 11) 

n    1  n   I  n^l  n 


It  is  from  (II.D.4.11)  that  we  see  explicitly  how  the  i.i.d.  innovation, 
{e  },  and  the  coefficient  {K'^^a,}  processes  impact  on  the  linear 
residual. 

Let        (^  be    the    a^algebra   generated    by    [  {  (  K  '  ^a ,  )  ,  e    }  ; 

1<  =  1  ,  .  .  .  ,n''1  ]  .  Intuitively,  (T^,,  ,  represents  all  the  information 
about  the  process  up  to  time  n-^1  .  Conditioning  on  ^_  ,  we  have  the 
following  two  useful  properties  of  R  as  noted  by  Nicholls  and  Quinn 
[Ref.    16:      p.    42]. 


E(Rj^|     cT^.,       )    =   0.  (II.D.M.12) 


E(R^I     <?)^-.        )    =   BtVar(K')x^^,    +  Var(e^)  (II.D.J4.13) 

n '     ^ "  '  1  n     n-^l  n 


These  results  follow  because  X  ,  is  a  function  only  of 

n^l  •^ 

the  process  through  (n-'l)  and  (K'-^a,)  and  e  are  both  independent  of  it. 

n   1       n 

(2)   The  Least  Squares  Estimator  of  Y  =  a, 6,.   Using  R  from 
1  1         n 

(II. D. 4. 10)  and  a  given  sample  from  {X  },  we  obtain  the  least  squares 

n 
estimator  by  minimizing  the  sum  1   R?  with  respect  to  the  product  a, 6. 

i=2  ^ 

which  is  now  called  T.   We  have 
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n 

I   X.X.  , 

y  =  — ,  (II.D.4.15) 

n 

i  =  2 


which,  in  fact,  is  the  usual  expression  for  the  estimation  of  serial 
correlation  in  linear  AR(1)  models  as  given,  for  example,  in  Chatfield 
[Ref .  31 :   p.  66]. 

Since  the  NLAR(1)  process  is  an  RCA(1)  process  of 

Nicholls  and  Quinn,  it  follows  from  their  theorem  [Ref.  16:   p.  4M]  that 

1 
Y  is  strongly  consistent,  asymptotically  unbiased  and  — — — -(T-'Y)  has  an 


/  N 
asymptotic  Normal  distribution.  The  asymptotic  variance,  from  the  same 

results  of  Nicholls  and  Quinn,  is 


0^  =  1  +  5a^6^^  -  6(a^B^)^  (II. D. 4. 16) 


Figures  II  .D.  4.  4-^11  .D.  4.7  contain  the  boxplot  analysis 
of  SIMTBED  [Ref.  15]  output  for  selected  choices  of  a  and  6.  in  the 
simulation  of  the  least  squares  estimator  of  the  product  a. 6.  in  the 
NLAR(I)  processes.  Note  that  although  the  estimated  asymptotic  mean  is 
the  true  value,  Y  =  a.B,  =  .64,  for  each  of  the  four  sets  of  the 
parameters,  the  estimated  asymptotic  variance  of  the  estimator  of 
a  B,  =  Y  is  different  for  each  of  the  four  different  sets  of 
parameters.  The  simulation  results  reflect  the  asymptotic  theoretical 
results  for  the  NLAR(1)  processes  as  given  above. 


68 


0) 

4-1 


tntM\    <n 


•  •  -oo 


h-OV^    « 


I   1  cno/N 


o 

4J 

e 

•H 

+J 
(/) 

H 

W 
0) 

CO 

+j 
tn 

dj 


ujuio^ 


O         •     •     ^>^      f-«XA      rt 

-Hf\       o    ooo  I        <n»rv*    o^ 
icy        ^  vof*<v<    *n 


•  +«  I- 


-+000000— tf\ 


irw— »-   .  . 

Z«W 

ir\ 

.    •    lO* 

owvl 

UJ 

ooo  1 

— OCUi 

mUK 

l/>UJU 

'■■fy  ' 

c    O 

-J 

Zt/M/) 

&. 

<0'>— 

u.o> 

a. 

UJUJ</1 

OZW 

< 

zzo 

<Q 

V>UJ 

z         X>- 

z- 

SIM 

<OOuja: 

«£0 

=— 

w>-^itrj 

UJ<1- 

4/></) 

Zl/>(/)bO^ 

IX« 

O 

•H 

cn    II 

i-H 

03    JZ 


A-i 

0 

U) 

.H 

m 

Q4 

CJ 

X 

u 

O 

o 

PQ 

u 

CU 

Q 

tJ 

^^ 

P3 

rH 

H 

— •' 

2 

£X 

M 

< 

CO 

K-) 

Q 


0) 

d 


69 


uiUiT«<^    r*-»-ON    €0 


PM-^-flOaMO    ^.ao    \D 


xiMVst  •  •    tfvnso  o 

•   •  •oo    eorgm  ia 

OOO  I    I      (M    •o  N> 

Otro  \o 


ov^r^J 

<o 

•-(NJ 

»or-CTv 

f* 

oo 
1  1 

\OOI- 

Jf 

UiUL9t«1 

O^lA 

f.<M»-ON<- 

■<i>i\ 

M 

CTur\o<n^ 

(MooOdrno 

■  OO 

I 

vOf---    .    . 

in 

•    •    OO 

CM 

OOO  1 

OO 
NSVMnS 

OO 

o 

<N<o<Mr«n 

p*»f* 

lA 

voco—   .  . 

ITKOCO 

f^ 

o 

o 

•    .    OO 

«0«/V*- 

VO 

HA 

o 

tn^om 

pg 

N 

^ 

^l/VN 

cwoa«ocs 


(NJ  — 


1 

4b</).. 

Loaoi 

UJH-tn 

»nr>p*.»w^ 

OZl- 

jTvoOKOa- 

uuz 

■^fMOM-g) 

lO— 

o 

•  ■  -oo 

— o 

ir\ 

OOO  1 

OUJki. 
<Ouj 
ocoo 

CM 

UJ      CJ 

o 

>  1 
<     1 

ui^\n 

z 

U\^V>-*fi 

zoz 

0«00>.JTSJ 

o-o 

OVKT*fiO^ 

in— 

irv^r-    .   . 

ZV)V) 

m 

.    .    o^ 

OUJCO 

CN 

OOO  1 

— KUJ 

cnuK 
(/luJO 
UJOCW 

cc    a: 

UIOLi. 

Ul 

K     O 

-1 

XV»/I 

UJ 

a. 

<«- 

1^<J> 

r 

UiUJt/) 

OZU 

< 

zzo 

<a 

V>UJ 

Z          *!- 

z— 

CDKJ 

<aouii£ 

«ca 

3— 

UJI-l-itO 

LJ<I- 

V>V> 

x»<nvi:£ 

l>v> 

uil  II 

<zx 


—  o-J 

in  h-< 

in  <o 

UJ  I- 

t£  — K 

O  f-OC 

u  cnuj 

K  UJ> 


5 

•H 

5 


it 


M-( 

0 

^ 

0 

4J 

g 

-r4 

-P 

CO 

w 

w 

aj 

U) 

M 

U5 

m 

Q) 

3 

U 

tr 

0 

CO 

M 

Pj 

-p 

05 

,,»— s. 

m 

rH 

(1) 

^— ^ 

J 

C^ 

< 

14-1 

i-l 

0 

2 

U) 

0) 

•H 

X. 

tn 

+J 

>i 

M 

c 

to 

•H 

+J 

1 

0 

11 

r-t 

r 

a 

00. 

X 

0 

-O 

CQ 

C 

(tS 

Q 

W 

(Ti 

m 

• 

Eh 

C/) 

in 

Q 


3 

■H 


70 


LJUJ^rj 


•    ■    OO       I  I 

OOO  I 


«o^«n    lA 


^(M 

OO 

uiui9m 

oooom 

ccKPO-m 

(MO^mTkO 

vor-rg   .  . 

ir« 

•    •    -OO 

f- 

OOO  1 

*■ 

CTXOOv 

CO 

•-rg 

C7W%3- 

eo 

OO 

■9 

UiUJt^-<7^ 

t-oir\ 

oo»r»-r. 

<«»/v 

^ 

jOVMOCg 

CM     .     • 

r- 

CNicorgtAff 

>oo 

1 

vOeofM    ■    • 

if\ 

•    ■    OO 

rw 

OOO  1 

O  UVM 

I  OO 

UJOOs  I    I 

(MOrgtr>rw 

O         ■    •    -OO  <7*0^     IT. 

— *n       o    OOO  t  mmtn    \o 

icg        f-  \o*rify 

OOO 


<»>OsOfsiiA     OZ>-      X 


X  I 

<  I 


HOOoo*n— if\ 


o 

.  .  oo 

— o 

%r\ 

OOO  1 

<Ouj 
OCUO 

rg 

Ui      (J 

o 

>  1 
<     • 

UMO^C 

z 

rgu-i3V>o 

zoz 

r-jp*^0 

o-o 

00— 

fcTv-Cg    .    . 

ztrtwi 

lA 

■   .   ^O 

OwbO 

CM 

OOO  1 

— KUi 

UJOCW 
WOu. 

UJ 

K      O 

^ 

Xtrttrt 

Ui 

ft. 

<(/!- 

u.o> 

z 

UUikK/1 

OZW 

< 

X20 

<o 

bOUJ 

z        3- 

z- 

mrst 

<aauja: 

<£ro 

=>— 

Lijt->-itr) 

UJ<»- 

t/}V) 

X(OV)v^:)C 

XXrt 

=— < 

OCCI 


O-l 

I- 


oa 

iH 

a 

H- 

(H 

0 

V4 

0 

■p 

(0 

£ 

•H 

+J 

w 

w 

M 

0) 

>^ 

(0 

3 

rr 

w 

4-1 

U) 

nj 

U) 

(1) 

V) 

J 

CD 

u 

iw 

o 

0 

^ 

(i 

en 

•H 

^-^ 

01 

rH 

>l 

■ — • 

r-f 

a 

ffl 

< 

5 

2 

4J 

OJ 

0 

x: 

r-( 

4-) 

n. 

X 

c 

o 

•H 

CO 

CO 

a 

w 

II 

CQ 

Eh 

ca 

en   a 

Q 


(1) 


71 


Z  I 

<  I 


oo  <o  -tn  ^ 

1  I  c*atn  o 

UMftJtfVN  •""^  ON 

<n3f*-00  CD*-'"'  NO 
NATVM     • 


I 


•-rg 

oo 

UIUOCC 

ootroON 

o«03v>r» 

csta^oo^ 

so-fg  .  . 

NfMS- 

o 

o 

•  •  «o 

cg(^r- 

lf\ 

\r\ 

OOO       1 

^   <^ 

CM 

CJ 

•om 

»A 

OvO    ' 

. 

mOiTN 

CO 

i-txa- 

<n 

•-fM 

OO 

UlUJOvO 

vaovoo»o 

sOf-.<n*-a- 

•-0>fi<MO 

vOCtMN    '    ' 

ir\ 

•    ■    -OO 

r^ 

OOO  1 

N  •-«co     O 


in**eo<ntr\       htt* 
ooi/vnm     I  oo 


CM 

o  dm 

I  oo 

U(M\C  I    I 

(OCMcos&cr  ujuj 

vocgo«oON  r-»rw5    \o 

O^irVNtO  l/NJCO      (M 

■  •  -oo  ovcf^    pg 

OOO  I   I  froo    vo 


0<Mr«-o^o 
OCgen<oo 


M  — 

O  k... 

Urt-eO  UJ>— 4/) 

ov-*r>o>rg  ozt- 

uv-cg    •   .  I  o — 

O       •    •   .OO  — o 

lA     OOO  I  U)u,— 


OUJU. 
KUO 


osoorgp.. 

irocni/vg 

iriTMcg  •  • 

•  •  oo 

OOO  1 


o— o 
to— 

ZC/HO 
OUjm 

— OCUJ 


-J  zvitn 

a-  <(/>— 

Z  UJUJt/) 

<  zzo 

muj  z       st- 

ffiNg  <ooujec 

3—  UJt— 1-^3 


OZU 

<o 

2— 
UJ<h- 


O-l 

►-< 
<o 

I- 
— >- 
t-o: 

(OUJ 

ui> 


Oi   Q) 


Q    -H 

W 


Eh 

H 
CO 


>X) 


>- 


Q 

H 
M 

a; 

3 

■H 


72 


An  analogous  result  is  given  in  Section  III.E.M,  where 
the  theory  of  least  squares  is  derived  for  the  Beta-'Laplace  AR(1)  model. 

(3)   The  Joint  Least  Squares  Estimation  for  a   and  6..   It 

n 
is  not  possible  to  minimize   V  R?  with  respect  to  a,  and  6,  individ-' 

ually.  However,  a  technique  from  Nicholls  and  Quinn  [Ref.  16:  p.  43]f 
which  uses  the  result  in  (II. D. 4. 13)  is  applicable.  As  was  pointed  out 
earlier  in  Section  II. C. 2,  by  assuming  nothing  about  the  particular 
marginal  distribution,  Nicholls  and  Quinn  were  free  to  treat  the 
variances,  o^  and  o^, ,  as  completely  independent  parameters  subject  only 
to  the  constraint  that  the  marginal  distribution  of  {X  },  whatever  it 

is,  has  a  positive  variance.  Then,  given  (II. D. 4. 13),  it  was   possible 

n  _ 
to  estimate  o^  and  o^,  by  minimizing  the  sum  of  squares  J^   S^  where 


;  =  R2  -  0^  -  o^.X^  , ,  (II.D.4.17) 

n    n    e    X'  n^l 


and  R^  =  (X  -'TX   ,)^  and  Y  is  from  (II. D. 4. 15).   They  derive  the 
n      n    n^  1 


properties  of  the  trivariate  distribution  of  the  estimator  of 

2  2 

Since   o^   and   o,5,  are  related  parametri  cally    in   a,    and 

£  K  I 


(Y,    al,    0^,). 


6.,  the  results  in  [Ref.  16]  concerning  the  variances  do  not  apply  in 
the  NLAR(1)  process.  However,  we  can  form  from  (II. D. 4.13)  and 
(II. D. 4. 10)   an  analogous   expression  for 
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S     =   R^  -   a,BMl-a.)X^^  -   2{]-a,$^)  ,  (II.D.4.18) 

n         n         11  I      ri"^!  i    i 


where  the  product   a  3^    in  R  is  not  replaced  by  Y  from    (II.D.il.  15)  . 

In  terms  of  a   sample   from    {X    },    we   define   the   joint 

least   squares   estimators  of  a     and   B.    to  be  those  values  a,    and   B.    that 
minimize 


n 

I   {(x.-a^6^x.^^)^  -  a^(1-a^)B^^x^^^    -  2(1-a^B^)}S  (II.D.4.19) 


where  (II. D. 4. 19)  is  the  sum  of  the  squares  of  S  given  in  (II. D. 4.  18). 
Now  it  is  clear  that  (II. D. 4. 19)  is  a  highly  nonlinear  expression  in  two 
unknowns,  a.  and  6,.  A  given  numerical  technique  could  converge  to  a 
local  extremum,  a  saddle  point,  or  diverge  depending  on,  among  other 
things,  the  starting  values  for  estimating  a.  and  B  . 

Constraining  the  nonlinear  optimization  problem  given 
by  (II.D.4.19)  to  the  rectangle  within  which  the  NLAR(1)  process  is 
defined'^-'O  ^  a.  ^  1  and  '^l  ^  B.  ^  1  ^-'eliminates  the  divergence  problem, 
but  clouds  the  estimation  issue  regarding  the  boundary  models  LAR(1)  and 
TLAR(l).  We  try  an  unconstrained  approach  described  below. 

(M)  An  Unconstrained  Nonlinear  Optimization  of  (II. D. 4. 19). 
It  is  easy,  but  tedious,  to  write  the  normal  equations  from  (II. D. 4. 19). 
One  critical  point  is  at  a  =  B.  =  0.  After  factoring  a  from  the  one 
equation  and  B.  from  the  second,  several  iterations  of  the  Newton^ 
Raphson  method  (see,  for  example,  Gerald  [Ref .  28:  pp.  122" 128])  can  be 
performed  to  find  other  critical  points.   The  Newton-'Raphson  method  uses 
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a  second^order  Taylor  series  approximation  to  solve  the  non^linear 
system  by  a  set  of  linear  Jacobian  equations.  However,  one  needs  to 
calculate  the  four  second  partial  derivatives  from  (II.D.4.19)  and  to 
have  a  good  starting  point  on  the  surface. 

The  IMSL  routine  ZSPOW  solves  systems  of  non-'linear 
equations  for  one  root  using  modified  Newton  methods.  This  routine  was 
used  to  solve  the  unconstrained  problem  of  finding  a  and  6  from  sets 
of  data  from  simulated  NLAR(1)  processes.  The  routine  was  very  senstive 
to  starting  values  and  did  not  always  converge  even  when  the  sample  size 
was    as    large    as    2500.       It    also    did   not    perform   well    when    the    true 

correlation  coefficient,    Y  =  a.B,,    was   small    for    any   of    the    simulated 

Ikl 
NLAR(1)    processes    with   the   same  autocorrelation  function,   Y'    '.      This 

problem  is  highlighted  by  the  fact   that    (II.D.U.19)    is    constant    along 

the   line  a.    =  0  and  the  line   6,    =  0. 

As  an  illustration  of   the   performance   of    the    routine, 

500  sets   of   sample  sizes   250  and  2500,    respectively,   were  generated  from 

the  NLAR(1)    process  with  a,    =    B,    =    .8.      The    scatter    plot    analyses    in 

Figures    II.D.4.8    and    II.D.4.9    show    how    the    estimators    a      and    6. 

determined  by   ZSPOW  are  related.      Especially    for    the    samples    of    size 

250,    there    is    the    same    pattern  of   the  hyperbola  as   seen   in  the  moment 

estimators    of    a      and    6.    given    in   Section    II.D.4.b.(2).       From    the 

accompanying    tables,    it    is    clear    that    the    variance   of    the   marginal 

distributions   for   each  estimator   ot.    and   6.    is  decreasing  with    increased 

sample   size.      The    Normal    plots    of    the    empirical    marginal   cumulative 

distribution  functions  for   a      and    for    6,    appear    very   non^-Normal    even 
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from  estimators  derived  from  samples  of  2500.  On  the  other  hand,  the 
Normal  plots  of  Y  =  a. 6,  indicate  that  the  distribution  is  converging  to 
a  Normal  distribution  as  required  by  theoretical  results  of  the 
previous  subsection,      (See  Figure  II. D. 4. 10). 

It  is  convenient,  at  this  point,  to  summarize  the 
results  on  the  moment  and  least  squares  estimation  of  Y  =  oi,B-  and 
(a   ,6   )    in  the  NLAR(1)    processes. 

In  the  estimation  of  Y,  only  second'^or der  product 
moments  are  required  for  both  methods.  From  the  Normal  probability 
plots  in  Figures  II.D.M.3  and  II. D. 4. 10,  it  appears  that  both  estimators 
of  Y  are  converging  to  Normal  distributions.  Although  the  moment 
estimator  of  Y  is  unbiased  (the  least  squares  estimator  is 
asymptotically  unbiased),  the  variance  of  the  moment  estimator  of  Y  is 
considerably  larger  than  that   of   the  least  squares   estimator   of   Y. 

The  estimation  of  a.  and  6,  requires  fourth-border 
product  moments  for  both  methods.  The  variance  of  the  moment  estimators 
of  a.  and  g  are  too  large,  even  for  samples  of  size  2500  to  be  useful 
in  distinguishing  between  NLAR(1)  processes.  The  least  squares 
estimators  of  a.  and  B  have  smaller  variances  than  the  corresponding 
moment  estimators  and  could  be  useful  in  distinguishing  between  NLAR(  1  ) 
processes.  However,  as  pointed  out  above,  the  numerical  routine  to  find 
the  critical  points  does  not  always  converge  for  a  given  starting  value 
of  a  and  B  .  The  conclusion  is  that  neither  method  of  estimating  a, 
and   6,    is  very  satisfactory. 


78 


.I l^gj.\j j I I J_J J I 

f----T---T-"l---r---^---^--'^VT^Vn----r 

r---T---T-i---p---i r--T-H----i---*T 


•^fe 


8 

8 

8   8 

8       8       8 

8  8 

: 

S 

S  S 

S       S        S 
31lihQ0d3d 

O      M 

T               1              i 

■:--'-v— ^ 

1              1 
1              1 

.^i — 
J :... 

1  1 

Lla... 
\  1 

.... 

L.... 

U--J 

.. 

.... 

1)11 
1)1) 
)    )    )    1 

p--« 

1     ^^ 

r---- 

... 



....; 

1             1 

1              1 
1             1 

1             1 

^!^^=^ 

, 

1 

1 

1 
t 

w^^ 

'A 

b; 

^-...i 

^fe 


-i 


i 

8 

8   8 

8       8        8 

8   8 

s 

S 

S   S 

5       8        S 

O      M 

1          x. 

:i  : 

1 
1 

<        • 

1 ^^ 

Zl^ 

^ 

^ 

1 

1 
1 

1 

1 — 4-- 

— i«. 

1      1 

^ 

o 

g 

col 


8        8 


8        8        8 

tf>  Q  n 

r^  <n  r« 

niirGDyid 


8   8 


8 

8   8 

8       8       8 

8   8 

S 

S  S 

R        8        S 
JllJ>GDy3d 

o    m 

7~— ' 

1 

1  ' 
1 

• 

"^ 

1 
1 

1 

1 

,---- 

^ 

— - 

n-  — — r---^ 

-  — , 

.1 

. 

j^^--^™j^ 

w... 

1              1 

?^ 

^*^^r^j 

5 


8 

8 

8   8 

8        8       8 

8   8 

S 

S 

S   S 

C        8       S 

niihGoaad 

o    to 

§ 

1 ^                    !_               J         J                     ^                           J 

_m_            ^s. 

>i^' 

— - 

...  J 

—-;-—; 

,            ^      ^ 

,j — ; — 

^ 

s 

•4- 

1 

■  1 

)         1         1         1 
)         1         1         1 
)         1         1         1 

)         )         )         1 
1         1         1         1 
)         1         1         1 

... 

-■l- 

~-~- 

II 

>- 

c 


CD   ° 

Ofe 

a: 
o 

-I 

I 

•»  5 


8        SS8        8        888        8 

«l  •         •  Pv  JS  IN,  ^ 

3^ll^QDa3d 


00 


CO. 
r-l    II 
OQ  rH 

a 
a   -p 

•H 


O 
o 
in 


cn 
o 

s    c 

■H       to 

+J 

m   o 
W   in 

(/] 
0) 

m 

tr 

CO 

4J 

C/J 
H3 
0) 


O 

cn 

■p 
o 


V) 
0) 
N 
•H 

cn 

o 

y) 

(1) 

,H 

C 

e 

cn 

O 

O 

in 

u 

c 


>i    cn 


•H 
•H 

X! 

o 
u 


(0 

u 
o 

2 


O 
O 
U 


(X 


x: 

4-> 


Q 


0) 
>-l 

3 

•H 


79 


(5)   The  Median  (X./X.^  )  Estimator  of  Y  =  a, 3^.   The  median 

of  (X./X.  ,)  was  seen  to  be  extremely  efficient  in  the  LAR(1)  process. 
1  1^1 

It  also  makes  sense  in  the  context  of  maximum  likelihood  estimation  in 

LAR(1).   This  is  discussed  in  the  next  section. 

Simulation  results  confirm  the  conjecture  that  the 

median  (X./X.  ,)  is  not  a  robust  estimator  of  Y  for  departures  from  the 
1   1^1 

LAR(1)  process.   In  fact,  from  the  boxplots  in  Figures  II. D. 4. 11  - 

II. D. 4. 14  of  SIMTBED  output  for  four  NLAR( 1 )  processes,  the  estimators 

seem  to  become  more  biased  as  6.  approaches  one — corresponding  to  the 

other  boundary  process,  TLAR(1).   Even  for  the  small  size  of  the 

simulations,  the  standard  deviation  of  the  mean  is  small.   For  the  three 

non-LAR(l)  models,  the  asymptotic  estimates  of  the  mean  of  Y  given  in 

the  data  are  each  significantly  different  from  the  theoretical  value  of 

Y  =  .64. 

d.     Method  of  Maximum  Likelihood 

(1)       Introduc t  i  on .       The    logarithm    of    the    likelihood 

function,    L(a.,B,),    is  obtained  by  taking  the  natural   logarithm  of   the 

n-dimensional   joint   density  given    in    (II.D.2.4)    and   treating    it    as    a 

function  of   a,    and   6,    for   a  given  realization  of   length  n  from    {X    }.      We 
11  °  n 
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have 


L(a^,6^)    =   -n(Jln2)    -    |xj    +     I     iln[a^  ( l-p2)exp{- |x. -6^x^_  J  } 


i=2 


+   (1-a^ ) (1-P2)exp{-|x. I }    +  a^p^Aexpl -A | x . -6^x . _ J } 


+    (1-a^ )P2Aexp{-X|x    | }], 


(II.D.U.20) 


where  p^  was   given   in    (II.D.13)    and   A   =  /(l-a,)6^    . 

Maximizing  (II.D.M.20)  in  the  general  NLAR( 1 )  model  is 
not  accomplished  here  for  two  reasons.  First,  L(a.,6.)  is  not 
dif  f  erent  i  able   with   respect    to    6,    at   any  of   the  n   values    6.    =  x./x._ 


for   i   =   1 n,   because  of   the  terms     x.-6,x.    , 

'    1      1    1-1 

routine  that   does   not   use  derivatives   is  needed. 


A   bivariate   search 


Second,  L(a.  ,6.)  is  not  defined  along  the  line  a.  =  1 
at  any  of  0  <  k  S  n  values  of  6.  such  that  -1  <  6,  =  x./x._  <  1.  To 
see  this,  examine  the  third  term  of  the  natural  logarithm  in 
(II. D. 4. 20).     We  have  replacing   X   for   all    i   =   2,...,n 


a^P2 


/(1-a^)6^ 


exp 


-Ix-B^x... 


/(l-a^)3^ 


(II. D. 4. 21) 


Because  of  the  presence  of  the  exponential  term  in  (  II . D . 4 . 21 ) ,  the 
limit  as  a  approaches  one  is  zero,  so  long  as  6.  *  x./x._  .  The  limit 
does    not   exist   on  the   set   B    =    {bJB,    =  x./x._    ;    i   =   2,...,n}. 
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It  is  worth  noting  that  for  a,  =  1  ,  corresponding  to 
the  LAR(1)  model,  and  except  on  the  set  B,  (II. D. 4. 20),  can  be  written 
as 


L(1,3^)    =  -n(iln2)    -    |xj    +   (n-1 )  iln(1-ep 


n 
-      I    |x.-6x._    |,        B      (!f  B.  (II. D. 4.22) 

i=2      ^ 


Now  X,n(1-6p  is  maximized  at  6.  =  0  and  the  optimal  value  for 

n 

y    |x.-B,x.     ,  I    is    the    least    absolute    deviation    (LAD)   estimator   of    B. 
1=2       ^       ^     ^-^'  1 

which   is   the  weighted  median  of    (x./x._    )   where  the  weights   are  x._      for 

i   =   2,...,n.      Thus,    if   after   a  large  number   of   observations  from    {X   }   no 

n 

repeats   of   x./x.    ,    are  observed,    then  there   will    be    little    difference 
1      1-1 

between    a    particular    LAR(1)    model    and   the   completely  random  model   of 
i.i.d.    Laplace   variables.      In  this  case,   for   any   B.     in    a   small    deleted 

neighborhood  around   B.    =  med(x./x._    ),    (II. D. 4. 22)    will   be   large   because 

n 

both   2,n(1-B^)    and     T    |x.-B,x.    ,1    will   be  optimized. 
1  . ^    '    1      1    1-1  '  ^ 

1  =  1 

(2)  The  Maximum  Likelihood  Estimator  of  a  in  the  TLAR(  1 ) 
Processes .  In  this  section,  the  likelihood  function  for  the  TLAR(1) 
process  is  described.  The  maximum  likelihood  estimator  is  found  using  a 
numerical  iteration  scheme.  The  properties  of  the  estimator  are 
investigated  and  compared  to  the  least  squares  estimator  using 
simulation . 


86 


For  the  TLAR( 1 )  models  (6.  =  1  or  B^  =  -1),  (II. D. 4. 20) 
can  be  written  as  a  one-dimensional  function  of  the  a  variable  a.  We 
have 


L(a)    =   -n(J,n2)    -    |x    |    +     I      in 

i=2 


'1  I       '    1 

expi 


/1-a^  Vl-a^ 


-    X 


+   /l-a,    exp> 


VI -a. 


(II. D. 4. 23) 


where 


V.     = 

1 


X.    -   X.    ,        a  ^   0, 

11-1 


X.    +  X.    ,        a   <   0, 
11-1 


(II. D. 4.24) 


-1    <   a   <    1        and        a. 


a    . 


Now   L(a)    is  continuous   everywhere   in  the  open   interval 

(-1,+1)    and  dif f erentiable  everywhere  except   at   a  =  0.      The  expressions 

_         dL(a)         ^    d'L(a)  .         ^ .,  ^  k 

for    —r and   — ] — 5 —  are    lengthy  and  cumbersome  to  use;    hence   are  not 

da  da^  ^      '' 

given  here. 

Examples  of  the  likelihood  curve  are  given  in  Figures 
II. D. 4. 15  -  II.D.4.18.  Each  curve  was  generated  from  a  sample  of  100 
from  a  simulated  TLAR(l)  process  with  the  stated  a  and  6..  It  is  easy 
to  see  the  non-diff erentiable  point  at  zero  and  how  flat  the  curve  is. 
To  see  that  there  is  a  maximum    (annotated  with  x  on  the  figure)    in  these 
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curves,   the  second  part  of  each  figure  focuses  on  the  function   near    the 
true  value  of  a, . 

The  IMSL  routine,  ZXLSF,  a  one-dimensional  search 
routine  was  used  to  find  the  value  of  a  that  maximized  (II. D. 4. 23).  The 
starting  value  a  was  the  least  squares  estimator  of  serial  correlation 
given  by    (II. D. 4. 15). 

Using  500  samples  of  sizes  50  and  500,  respectively, 
from  simulated  TLAR(1)  processes  with  Y  =  .64,  the  scatter  plot  analyses 
in  Figures  II. D. 4. 19  and  II.D.4.20  were  completed.  The  least  squares 
estimator  and  maximum  likelihood  estimator  appear  to  be  correlated. 
From  the  accompanying  tables,  the  maximum  likelihood  estimator  appears 
to  have  a  smaller  variance  and  bias  than  the  least  squares  estimator. 
Analysis  of  the  boxplots  from  a  SIMTBED  comparison  of  the  least  squares 
estimator  and  the  maximum  likelihoood  estimator  reflect  the  same  results 
(see  Figures    (II. D. 4.21    -   II. D. 4. 22). 

From    the    Normal    plots   given   in  Figure  II. D. 4.23,    both 
the   least  squares    and   the   maximum    likelihood   estimator    appear    to    be 
coverging  to  a  Normal   distribution.      There  are  three  or   four   outliers   in 
the   tail   out   of   500  points. 
E.       OTHER   CASES    OF   THE    NLARMA(p,q)    MODEL 
1 .      Introduction 

A  primary  advantage  of  the  NLARMA(p,q)  model  is  the  ease  with 
which  the  basic  framework  can  be  altered  to  cover  a  variety  of  different 
dependency  structures.  The  NLAR(2)  and  NLAR( 1 )  processes  have  been 
examined  closely   in   the    previous    sections    of    this    chapter.      At    this 
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time,  the  moving  average  first-order  model,  NLMA(1),  and  the  mixed 
model,  NLARMA(1,1),  are  briefly  considered.  The  correlation  structure 
and  parameter  space  are  discussed  for   each  model. 

The  TLAR(I)  model  for  which  the  maximum  likelihood  estimation 
was  completed,  can  be  easily  extended.  As  the  final  part  of  this 
section,  we  present  the  p  -order  autoregressive  processes  for  arbitrary 
p  S  2.  The  conditions  for  existence  and  uniqueness,  the  correlation 
structure  and  likelihood  function  are  given.  The  maximum  likelihood 
estimation  scheme  for  the  p  parameters  is  also  discussed. 
2.      A  Backwards  MA(1)    Model,    NLMA(1) 

a.     Correlation  Structure  of   the  NL^4A(1)    Process 

From    (II. D. 1.1),   we  see  that    X      is    the    random    coefficient 
sum   of    independent    variables    each   of    which   have    a  marginal    Laplace 

distribution.      Therefore,    we    can   replace   X      ,    by    another    Laplace 

^  n-1       ^  ^ 

variable.       If    it    is    independent    of    L      and   has    a   standard   Laplace 

marginal   distribution,   then  by  the   construction,    X      will    still    have    a 

n 

standard  Laplace  marginal   distribution. 

If   we   replace   X      ,,    in   fact,    by    L      ,     in    (II. D. 1.1),    we 
y  n-1  '      ^      n-1  ' 

obtain  the  following  expression  for  X 

n 


X      =   K'6,L      ,    +   K  L      ,  (II. E. 2.1 

n  n    1    n-1  n  n 


where  { K '  }  and  (L  }  are  as  given  in  (II. D.I. 2)  and  {K  }  is  the 
corresponding  two-valued  discrete  variable  as  given  in  (II.C.2.4)  for 
the   NLAR(2)    model. 
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Since  X  _  is  by  construction  in  (II. E.  2.1)  independent  of 
X  for  |k|  ^2,  we  see  that  the  model  has  the  cut  off  property  of  a 
linear  MA(1)  model.  The  maximum  range  of  correlations  in  any  MA(1)  is 
less  than  or  equal  to  |l/2|,  (Fuller  [Ref.  29:  p.  62]).  This  range  is 
achieved  by  the  linear  MA(1)  models.  Some  of  the  random  coefficient 
MA(1)  models  have  been  shown  to  have  a  maximum  range  for  the 
Corr(X    ,X  _    )   to  be  strictly  less   than  one-half    (see  Hugus    [Ref.    30]). 

Using  (II. E. 2.1)  recursively,  we  derive  the  serial 
correlation  in  NLMA(1)   as 


Cov(X    ,X        ) 

Corr(X    ,X      ,)    =  "    '       , 

Var(X    ) 
n 


E{(K'B,L        +K  L    )X      , } 
n    1    n-1      n  n     n-1 


-i^^^ViVi^ 


^E{L^.^(K;_^6^X^_2^K^_^L^_^)}. 


=  a^B^E(K^_^).  (II.E.2.2) 


Substituting    in   the    values    of    the    i.i.d.    sequence     {K    }    with    the 
o  n 

corresponding  probabilities   p„,    I'Pp  from    (II. D.I. 3)   we  have 
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Corr(X    ,X      ,)   =  a,B,{(1-p,)    +  /(1-a,)3^   P,} 


a^B^i 


1-B^    +  a^3^/(1-a^)B^ 


1    -    (l-a^)B^^ 


(II.E.2.3) 


Figure  II. E. 2.1    is  a  contour    plot    of    the    level    curves    for 

p(1)    =    Corr(X    ,X      ,).      Notice    that    in   this  model,   the   correlation  is 
n      n- 1 

restricted  in  range  over  that  of  the  linear  MA(1)  models.  Using  the 
IMSL  global  constrained  optimization  routine,  ZXMWD ,  with  multiple 
starts,  the  extremes  for  lag-1  serial  correlation  are  | p(  1  )  |  ^  0.4026, 
occurring  at  a,  =  .903  and  6,  =  ±.690.  In  Chapter  III,  we  give  a 
continuous  random  coefficient  model  with  MA(1)  correlation  structure, 
Laplace  marginal  distribution,  and  the  full  range  of  correlations,  i.e. 
|p(1)|    ^5. 

b.      Invertibility  in  NLMA(1) 

It   is  well   known    (Chatfield   [Ref.    31,    p.    43])    that   if 


n  n  1    n-1 


(II.E.2.4) 


is  a  linear  MA(1)  model,  then  substituting  (1/6.)  in  for  6.  does  not 
change  the  autocorrelation  function.  This  implies  that  the  linear  MA(1) 
model  is  not  uniquely  determined  by  its  autocorrelation  function. 

It  is  also  well  known  (Chatfield  [Ref.  31:  p.  ^3])  that  by 
successive  substitution,  the  MA(1)  model  in  (II.E.2.4)  can  be  written  as 
the  infinite  aut  or  egression 
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^n'^n-  8lVl    *   e^Vz  -*••••  (II.E.2.5) 


Likewise,    if   1/6.    is  in   (II.E.2.4),   we  have 

Z      =X     -  I  y.      ,+T2X     ^   -  + (II.E.2.6) 

n         n       6     n-1        6^  n-2 

Unfortunately,  only  one  of  the  two  processes  given  by  (II.E.2.5)  and 
(II.E.2.6)  yields  a  convergent  power  series  depending  on  whether 
IbJ  <  1  or  not.  Hence,  the  restriction  on  $  called  "invertibility"  by 
Box  and  Jenkins  [Ref.  23:  p.  50],  guarantees  a  one-to-one 
correspondence  between  a  linear  MA(1)  model  and  its  autocorrelation 
function  by  restricting  Q  to  be  such  that  the  MA(1)  "inverted"  infinite 
autoregression  is   the  one  with  a   convergent   power  series   representation. 

This  definition  of  invertibility  is  not  totally  applicable 
to  random  coefficient  models  (such  as  NL^4A(1))  with  MA(1)  correlation 
structure  because  it  has  not  been  established  that  there  exists  a 
corresponding  infinite  autoregression  model. 

Likewise,  there  can  be  an  infinite  number  of  models  that 
have  the  same  autocorrelation  function  and  marginal  distribution.  This 
is  the  case  in  NLMA(1).  As  was  seen  in  Figure  II.E.2.1,  each  contour 
line  corresponds  to  a  constant  value  of  p(1)  and  is  achievable  by  an 
infinite  number   of   combinations  of    (a., 6.). 

The  purpose  of  this  section  then,  is  to  find  a  different, 
but  meaningful,   way  to  restrict   the    (a   ,6.)    rectangle  in  Figure  II.E.2.1 
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which:  (1)  does  not  further  restrict  the  range  of  p(1);  and  [2)  which 
within  the  region  the  NLMA(1)  model  must  be  uniquely  determined  by  p(l) 
and  either   a.    or    B,  . 

From  the  contours  in  Figure  II.E.2.1,  it  appears  that  the 
feasible  region  for  p(1)  can  be  partitioned  in  such  a  way  that  the  two 
goals  stated  above  can  be  achieved.  It  is  not  known,  however,  if  this 
partition  can  be  described  analytically.  Figure  II.E.2.2  is  an 
illustration  of  the  partition  into  a  center  region  and  two  complementary 
disjoint  regions.  The  center  region  is  roughly  defined  as  the  region  to 
the  right  of  a  line  from  (-1,.667)  to  (-.577,1)  and  to  the  left  of  a 
line  from  (.577,1)  to  (1,.667).  Both  lines  cut  across  the  contours  in 
the  depression  on  the  left  and  on  the  ridge  on  the  right.  The  center 
region  is  more  advantageous  for  two  reasons.  First,  p(  1  )  is  a 
continuous  function  of  a,  and  6.  in  the  center  region.  Secondly,  the 
parameter  estimation  is  more  likely  to  be  easier  if  the  most  extreme 
values  of  a.  and  6.  can  be  avoided  simultaneously.  Therefore,  we  shall 
call  the  center  region  of  Figure  II.E.2.2  the  "principal"  region. 
3.      A  Mixed  Autoregressive-Moving  Average   Model,    NLARKA(1,1) 

From  the  theorem  in  Section  II. C. 2,  we  see  that  any  two 
(possibly  dependent)  Laplace  variables  can  be  combined  with  an 
independent  set  of  (again,  possibly  dependent)  Laplace  variables  to  form 
another  Laplace  variable.  Using  this  property,  if  we  replace  X  _  in 
NLAR(2)  by  L  _  ,  then  the  marginal  distribution  of  {X  }  is  still 
standard  Laplace.     We  have  then 
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X  =  B.K'X   ,  +  B^K"L   ,  +  K  L  ,  (II. E. 3.1) 

n    1  n  n-1    2  n  n-1    n  n 


where  {K',K"},  {L  },  {K  }  are  as  previously  defined, 
n  n     n     n 

Notice  that  if  K'  is  identically  zero,  corresponding  to  a  =  0, 
we  obtain  an  expression  of  the  form  given  by  (II.E.2.1)  for  NLMA(1). 
Likewise,  if  K"  is  identically  zero,  we  have  the  NLAR(1)  model  as  given 
in  (II. D.I .1)  . 

The  NLARMA(1,1)  model  has  the  same  correlation  structure  as  the 
linear  mixed  model  AR^4A(1,^).  Using  (II.  E.  3.1), 


E(X^X^  .)    =  a.6,E(X^  .)  +  a„B,E(L   . X^  .) 
n  n-i     I  1   n-1     2  2   n-1  n-1 

+  E(X   ,K  L  )  .  (II.E.3.2) 

n-1  n  n 


But  X      , ,   K     and  L     are  independent  so 
n-1        n  n  ^ 


^^S^Z^'VlVa'    *E(L=.,K^.,)).  (II.E.3.3) 


Conditioning   on   K      ,,    using   the    independence    of    {L    }    and 
°  n- 1  n 

(^  _P'    ^  -1^   ^^^  dividing  by  the  Var(X    )   we  have 


p(1)    =   a^B^    +  a262(l~P2"P3*i^2|P2*l^3|P3^    '  (II.E.3.4) 
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where   p^,    Po,    1^2!'    I^^l    ^^^  defined  in    (II.C.2.5)    through    (II.C.2.9). 
For   1^2,    (II.E.3.3)    and   (II.E.3.4)    become 


^'Wi'   '"l^l^VlVi*  CII.E.3.5) 

and 

p(J,)    =  a^B^pCil-l).  (II. E. 3.6) 


These  equations  are  the  same  as  those  of  the  ARMA(1,1)  model 
(see  Chatfield  [Ref.  36:  p.  58]).  However,  the  range  of  correlations 
is  significantly  reduced  over  that  of  ARMA(1,1).  Figure  II. £.3.1 
represents  a  side-by-side  comparison  of  the  (p(1),  p(2))  space  for 
NLARMA(1,1)  and  the  familiar  linear  ARMA(1,1).  Although  p(1)  can  range 
from  -1  to  +1,  the  combinations  with  p(2)  are  severely  limited  in 
NLAR^4A(  1  , 1 )  .  The  minimum  p(2)  in  NLARMA(1,1),  found  numerically  using 
the  reduced  gradient  method  is  approximately  -.025  at  p(1)  =  ±.2.  As 
|p(1)|    increases,    p(2)   approaches    p(1)^. 

4.      Higher  Order  Autoregressive  Models,    TLAR(p) 

a.      Introduction 

It  has  been  stated  by  Raftery  [Ref.  32]  that  there  exists 
NEAR(p)  models  for  p  S  2.  Also,  Nicholls  and  Quinn  [Ref.  16]  have  given 
conditions  for  the  existence  and  uniqueness,  strict  stationary,  etc., 
for  general  RCA(p)  models.  However,  only  for  the  NEAR(2)  and  the 
NLAR(2)  processes  has  it  been  shown  explicitly  what  the  necessary 
innovation  is;    and  that   it   is  a  valid  random  variable. 
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For  p  ^  3,  this  has  not  been  accompli  shed  for  the  general 
NEAR(p)  process;  nor  is  it  done  now  for  the  NLAR(p)  process.  However, 
there  are  2  different  p  order  autogressive  models  with  p  parameters 
that  are  special  cases  of  the  NLAR(p)  process.  These  models  are  called 
the  TLAR(p)  models.  The  innovation  for  the  second-order  model  was  given 
without  proof  following  the  theorem  in  Section  II. C. 2.  The  likelihood 
function  and  maximum  likelihood  estimation  of  a  was  given  in  Section 
II. D. 4  for  the  TLAR(1)    processes. 

The  TLAR(2)  models,  including  the  two  TLAR(1)  models  only 
account  for  four  of  the  infinite  number  of  NLAR(2)  models  which  all  have 
the  same  AR(2)  correlation  structure  and  standard  Laplace  marginal 
distribution.  Since  there  is  a  variety  of  different  sample  path 
behaviors  obtainable  in  the  general  NLAR(2)  model,  it  is  possible  that  a 
TLAR(2)  model  will  not  always  be  the  most  appropriate  model  for  a  given 
set  of   data. 

However,  as  is  shown  in  the  remainder  of  this  section,  the 
TLAR(p)  models  have  an  advantage  over  the  general  NLAR(p)  models.  The 
TLAR(p)  processes  for  p  ^  3  exist;  are  easily  constructed;  are  partially 
time  reversible;  and  are  parsimonious  with  respect  to  parameters.  The 
parameters  in  the  TLAR(p)  process  are  easily  estimated  from  the 
conditional  likelihood  function  by  the  method  of  maximum  likelihood, 
b.      Existence   and  Uniqueness 

The  TLAR(p)  models   p  ^    1    have  the  form 


?      (i) 
X      =      I   K^^X      .    +    e    ,  (II. E. 4.1) 

n        .    ,    n       n-i  n 

1  =  1 
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where  {X^}  is  assumed  stationary  with  Laplace  marginal  distribution; 
{K  ,...,K  }  =  K  is  a  p-variate  discrete  random  variable  independent 
of    {e^}   and  X^_^ ,   X^_2 , For   all   n 


(1 ,    0,    0,  ..  .,    0)      w.p.    a. 


(0,    1  ,    0,  . . . ,    0)      w.p.    a. 


(0,    0,    0,  .  .  .,    1)      w.p.    a 

^      P 
(0,    0,    0,...,    0)      w.p.    1    -    Z   a.    =    X    >   0,  (II.E.i4.2; 

i  =  l    ^ 


so  E(K   )  =  a.  for  all  i  =  1, ,p.   The  2^  choices  of  model  arise  from 

n       1 

the  selection  of  signs  for  each  of  the  X  _.  (either  +1  or  -1). 

Now  if  {X  }  is  stationary,  then  the  following  expression  for 

the  characteristic  function  of  the  i.i.d.  innovation,  e    ,  follows  from 

n 

(II. E. 4.1)    regardless   of   the   choice  of   signs   on  X   _..      (The   distribution 
of   a  symmetric  random  variable  Z   is  the  same  as   that   for   -Z) .     We  have, 


<p(i^)    =   E[exp{-ia)(    I  k\      X^_.+e^)}], 
n  1  =  1 


)    (cj)    [    I   a.    4)y    (w)    +    X], 

£  .,1a. 

n  1=1  n-i 


109 


)    (w)    [(1-X)    <()„(w)    +   X], 
n  n 


from  conditioning  on  K    ,   the   stationarity   assumption   of    {X    }    and   the 

n  n 

independence   of    e      of    X      ,,    X      „ Therefore,    substituting  from 

n  n- 1         n-d 

(II. B.I. 2) 

e  1  +03        [\  +0)  I 

=   1/(1+X(i)2).  (II.E.4.3) 

For   X   >  0,    (II.E.4.3)    is  recognized  as   the   characteristic  function  of    a 

scaled  Laplace  random  variable  with  scale  parameter   /x~ . 
Since    (II. E. 4.1)    can   be  written   as 


X      =      I    {a.X      .    +    (K^^^-a.)X      .}    +    e  (II.E.4.4) 

n        .    ,      1   n-i  n  i      n-i  n 

1  =  1 


and  satisfies   the   conditions   in  Section  II. C. 2,    the    TLAR(p)    models    are 

RCA(p)    models.      Since    the    innovation    {e    }    and    (K    }    are  i.i.d.,    then 

TLAR(p)   are  strictly  stationary  and   {X   }    is  the   unique    solution   by    the 

theorems  of  Nicholls  and  Quinn   [Ref.    16:      p.    31    and  p.    37]. 

c.      Correlation  Structure 

The  TLAR(p)   models  are  p     -order   autoregressive   in  the  sense 

that   E(X    Ix      ,    =  X,,   X     ^   =  x^,...,   X  =  x    )    is   a   linear    function    in 

n'    n-1  1        n-2  2  n-p  p 

X.,     i    =    1,...,p.       It    is    also   autoregressive    in   the    sense    that    it 
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satisfies   a  set  of  Yule-Walker    equations.      Multiplying    (II. E. 4.1)    by 
X      . ,    i  Z   ] ,   and  taking  expectations,   we  have 


^'■W-i^  -  ^^'ViVi'  *•••  V'VpVn'-  (ii.E.'..5) 


Dividing   by    Var(X    )    and   substituting    i    =    1,...,p   into    (II.E.4.5),   we 
have  the  set  of   equations 


p(1)    =  a^    +  a^pd)    +...+a   p(p-1) 
p(2)    =  a^pd)    +  a^  +...+a    p(p-2) 


p(p)    =  a^p(p-l)    +  a2p(p-2)    +...+  a    ,  (II.E.i4.6) 


where  a.    =  a. (Sign  of   X      .)   for   all    i   =    1,...,p. 
11  n-i 

For   the  TLAR(2)    cases,   the    (p(1),    p(2))   admissible  region  is 

the  entire  diamond  given   in  Figure  II.C.3.1.       It    is    divided,    however, 

into   four    right    triangles,   one  per   quadrant,   corresponding  to  the  sign 

of  X      ,    and  X     „   in  the  model. 
n-1  n-2 

d.      Conditional   Density  of   X    Ix      ,  ,X     „,..., X 

•^  n '    n-1      n-2  n-p 

The  conditional  density  for  each  of  the  2  specific  choices 
of  signs  are  easily  found  noting  that  the  conditional  probability  is 
just  a  sum  of   Laplace   cumulative   distributions.     We  have 


in 


a,P(/x~L     <   x-xj    +...+  a   P(/r~L      <   x-x    )    +   XPC/T'l      <   x) 
1  n  I  p  n  p  n 


(x-x    1                           1 x-x    I 
a,F,  I ^J   +...+  a   F,  I ^)   +    XF.  f-^l ,  (II. E.  4.7) 


where  F  (•)  is  the  cumulative  distribution  function  of  a  standard 

Li 

Laplace  random  variable.      Taking  derivatives   with  respect   to  x,   we  have 


^X    IX      X         ^^l^1'--"^p^    =-=   J/i^Lf-irj    ^   ^^L^    ' 


nl"n-1 n-p  ^  /X      i  =  1    "    "^      "    ^  "^/X 

(II.E.il.8) 


e.      Conditional  Maximum  Likelihood  Estimation  of    (a    ,...,a    ) 

Since  there  are  many  p-variate  Laplace  distributions  that 
(X  ,...,X  )  could  have,  and  that  the  particular  one  is  not  known  to  us, 
it  is  not  possible  to  form  the  exact  likelihood  function  which  is 
written 


n 

ly               Y         ~l  yIy                             Y             )  Y                      Y      '                    ^IX.Cj.M.y, 

A      ...A                [  .  A.      A.           ,...,A.          S  A      ,...,A^ 

n          1        U=P'^1  1  '    1-1               i-pj        p              1 


Instead,    we    can    calculate   the    conditional   log-likelihood 
function  as   the  logarithm  of    the    product    of    the    first    (n-p)    terms    in 
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(II.E.M.9).  This  is  commonly  done.  See,  for  example,  Priestly 
[Ref.  33:  p.  350].  Using  a.  =  a.sign(X  _.),  we  have  the  following 
single  expression  for  the  conditional  log-likelihood  function,  given  the 
n  realizations  from  TLAR(p)  process,  written  as  a  function  of  a.  for 
i   =   1 ,  . . .  ,p. 


L(a    ^n^    "        ^      ^" 

^  i=p+1 


A.X.       .,...,A. 
1 '       1-1  1-p 


n 

I        in 
i=p  +  l 


P  V.  . 


/Tf, 


X  . 

1_ 


(II. E. 4. 10) 


where 


V. 


ij 


X.  -   X.     .  if   a .    ^   0,        j    =    1 ,  .  .  .  ,p, 

1  i-J                 J 

X.  +  X.     .  if   a.    <    0, 

1  i-J                 J 


(II.E.M.11) 


i   =  p+1,...,n;    a.    =    la. I    and   X  are  functions   of   the   variable  a.. 

We  see  that  when  p  =  1  (II. E. 4. 10)  and  (II. E. 4. 11)  give  the 
expressions   used  in  the  TLAR( 1 )    process   in  Section  II. D. 4. 

As    a   function   of    (a    a    ),    (II. E. 4. 10)    is    continuous 

throughout   the   interior   of   the    p-dimensi  onal    subspace    on   which    it    is 

defined.       It    is    not    dif f er ent i able   with   respect    to  a.    anywhere  that 

1 

a.     =   0.       The   maximization  of    (II.E.4.10)    can    be    formulated    as    a 

1 

constrained   non-linear    program   for   which   a   numerical    routine  would 
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probably  be  required  to  solve   for    (a^ a    ),    the   Joint    conditional 

likelihood  estimator   of    (a,, a    ). 

1  P 
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III.      CONTINUOUS    RANDOM  COEFFICIENT    MODELS 
WITH  SYMMETRIC   NON-NORMAL   MARGINALS 

A.      INTRODUCTION 

The  discrete  random  coefficient  NLARMA(p,q)  models  studied  in  the 
previous  chapter  offered  a  variety  of  different  dependency  structures 
analogous  to  their  linear  ARMA(p,q)  counterparts  as  described  in  the 
Box-Jenkins  approach  to  time  series  analysis.  These  models,  however, 
could  be  considered  deficient  in  some  ways.  For  one  thing,  all  the 
models  have,  by  design,  the  same  marginal  distribution,  i.e.  Laplace. 
To  obtain  a  different  marginal  distribution  would  require  starting  over 
to  develop  the  appropriate  innovation  sequence.  Rafter y  [Ref  .  32]  has 
reported  some  results  in  extending  the  NEAR  framework  to  other  models 
with  different  marginals  and  ARMA   correlation  structures. 

Furthermore,  the  parameter  estimation,  which  is  easy  to  do  in 
Gaussian  linear  AR(p)  models,  is  not  particularly  easy  in  the 
autoregressive  process  of  the  NLARMA(p,q)  family.  In  the  moving  average 
and  mixed  models  of  NLARMA(p,q),  the  maximum  likelihood  procedure  is 
even  more  difficult.  Raftery  [Ref.  32]  claims  that  the  maximum 
likelihood  estimator  of  6, in  the  NLAR(l)  process  would  be  super- 
efficient  based  on  his  work  in  parameter  estimation  in  the  NEAR(1) 
process  and  the  extensions  that  he  has  proposed.  Super-efficiency  is 
not   an   attractive   property  of   an  estimator. 
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Again,  the  moving  average  model,  NLMA(1),  does  not  allow  for  the 
full  range  of  correlations  that  are  obtainable  with  the  linear  MA(1) 
model . 

Finally,  note  that  there  is  another  attractive  property  of  the 
random  coefficient  models  that  is  not  fully  exploitable  in  the  di  screte 
random  coefficient  models  (NEAR(1)  and  NLAR( 1 ) ) .  That  is,  in  the 
NLAR^4A(p,q)  models  the  coefficients  of  the  process  can  change  somewhat 
over  time  and  the  process  itself  remains  stationary.  Andel  [Ref.  3^] 
has  noted  that  in  many  applications  of  time  series  analysis, 
particularly  in  the  fields  of  hydrology,  meteorology  and  biology,  the 
coefficients  of  the  model  are  attempting  to  describe  complicated 
processes.  The  coefficients  may  have  some  random  behavior  of  their  own, 
apart  from  that  usually  attributed  to  the  independent  innovation 
sequence. 

If  stationary  constant  coefficient  models  are  not  particularly  good 
at  modelling  such  systems  (as  suggested  by  Andel  [Ref.  3^]),  then  the 
NLARMA(p,q)  models  would  not  be  much  better  because  the  coefficients  are 
limited  to  a  finite  (very  small)  number  of  possible  values.  However, 
Lawrance  and  Lewis  [Ref.  6]  have  shown  in  the  case  of  NEAR(l)  that  it  is 
possible  to  alter  the  character  of  the  sample  paths  of  a  given  low-order 
aut  or  egress ion  by  extending  the  two-parameter  model  to  one  having  4 
parameters.  The  number  of  extra  parameters  could  be  excessive  and  the 
costs   in  parameter   estimation  unacceptable. 

In  this  chapter,  a  different  family  of  stationary  random  coefficient 
time  series  models  is  introduced  which  retains  many  of  the  favorable 
aspects    of    the    NLARMA    family    (specified  marginal    and    correlation 
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structure)  and  offers  alternatives  in  the  areas  pointed  out  above  as 
disadvantages   in  the  NLARMA   construction. 

The  symmetric  marginal  distribution  can  be  specified  by  one  shape 
parameter  to  be  any  one  of  an  infinite  number  of  non-Gaussian  distri- 
butions. This  family  is  the  i. -Laplace  family  and  is  examined  in  the 
next  section.  The  f amily--including  as  a  special  case  the  double 
exponential  (Laplace)  distribution--has  members  with  extremely  high 
kurtosis,  as  well  as  those  that  have  a  limiting  kurtosis  that  approaches 
that  of  the  Gaussian  distribution.  This  offers  a  significant  advantage 
over  the  NLARMA  models. 

Just  as  discrete  random  variables  are  needed  for  the  coefficients  in 
the  NLARMA(p,q)  models,  the  square  roots  of  Beta  random  variables  are 
used  in  this  family  of  models  to  maintain  the  i-Laplace  marginal  distri- 
bution. The  square  root  Beta  transformation  theorem  is  the  key  result 
through  which  all  the  time  series  models  in  this  chapter  are  formulated. 
By  the  theorem,  Laplace  variables  are  changed  into  those  that  have 
J,-Laplace  distributions.  Previous  uses  of  Beta  random  variables  in 
modelling  non-Normal  time  series  is  evident  in  the  models  with  Gamma 
marginals  of   Lewis   [Ref .    35]   and  Hugus    [Ref .    30]. 

The  fact  that  the  coefficients  are  continuous  instead  of  discrete 
allows  for  a  continuous  variation.  That  they  are  functions  of  Beta 
random  variables  restricts  the  variation  to  a  bounded  interval.  This  is 
likely  to  facilitate  the  modelling  of  those  "complicated"  systems  as 
described  by  Andel    [Ref.    3^]- 
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The  principal  models  investigated  in  this  chapter  are  those  with 
first-order  autoregressive  correlation  structure.  They  are  first-order 
Markov  processes.  For  the  purpose  of  discussing  parameter  estimation  in 
this  family  of  autoregressive  models,  as  opposed  to  the  NLAR(1)  family, 
the  focus  is  narrowed  to  that  AR(  1  )  model  of  the  family  with  Laplace 
marginals--the  so-called  Beta-Laplace  First-Order  Autoregressive  model, 
BELAR(I).  Several  point  estimators  of  location  and  scale  are  discussed 
and  examined  through  simulation  in  SIMTBED  [Ref .  15].  The  one  parameter 
which  uniquely  determines  all  the  correlations  of  lag  k  in  the  BELAR(l) 
model  can  be  estimated  by  a  least  squares  procedure  which  has  very  nice 
asymptotic  properties.  The  maximum  likelihood  estimator  of  serial 
correlation  is  also  obtained  using  numerical  methods. 

First-order  autoregressive  correlation  structure  is  not  the  only 
type  of  dependency  relationship  that  is  obtainable  from  using  the  square 
root  Beta-Laplace  transformation.  In  the  last  section  of  this  chapter  a 
first-order  moving  average  model  and  an  extension  to  a  q  -order  model 
are  introduced.  The  MA(1)  model  retains  the  full  range  of  correlations 
of  the  linear  MA(1)  models.  This  was  not  the  case  in  the  NLMA(1)  model. 
B.      l-Lk?LkCE    DISTRIBUTION 

1  .      The   JL-Laplace  Random  Variable 

It  was  shown  in  Section  II. B  that  the  standard  Laplace 
distribution  belongs  to  the  class  of  infinitely  divisible  distributions. 
The  probability  density  function  of  a  Laplace  distributed  variable  was 
given  in  (II. B.I).  The  characteristic  function  of  the  standard  Laplace 
random  variable  was   given  in    (II.B.2).      Thus   if 
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1   ^i 


(<^)  =  (t^)     .      «-  >  0  . 


(III. B. 1.1) 


then  X  is  a  random  variable.  In  fact  it  is  the  difference  of  two 
independent,  identically  distributed  Gamma(ll,l)  random  variables  where  I 
is  the  shape  parameter  and  1  is  the  scale  parameter.  Therefore,  if  X 
has  a  characteristic  function  given  by  (III. B. 1.1),  then  X  is  an 
g. -Laplace  random  variable. 

Since    (III. B. 1.1)    is    a   real    function   of    oo ,    X    i  s    a   symmetric 
random  variable.      It  is  easily  verified  that 


E(x'^)    = 


if     n  is  odd. 


(III. B.I  .2) 


{k*]{^^^l^^^     if     n  =  2k,     k   =   1,2 


[k] 
where    b  =  b(b+1 ) . . . (b+k-1 )    for   all   b  >  0.      Since  all  odd  moments   are 

zero  in    (III. B.I. 2),   the   Jl-Laplace   distribution    is    not    skewed   for    any 

I  >   0.      From    (III. B.I. 2)    we  find  that   the   kurtosis   is 


'2   - 


E(X^    -    (E(X^))^ 
n n 

Var2(X    ) 
n 


(2Jl)2      ~   ^         I    ' 


(III. B.I. 3) 


The   kurtosis    approaches    3   as    2,    -»•    <»,    which    corresponds    to   that    of    a 
Normal   distribution. 

Since  an  Jl-Laplace  random  variable,  X(!l),  is  the  difference  of 
two  i.i.d.  Gamma(il,1)  random  variables,  we  obtain  the  density  for  X(8,) 
by  using  conditional   expectations. 
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If    G    (£,,1)    and    G    ( 2, ,  1  )    are    the    i.i.d.    Gamma(2,,1)    random 
variables,   then  conditioning  on  G    (£,1),  we  have 


P(X<x)    =   P(G^-G2<x)    =   E^   {P(G^-G2<x|G2=g)} 

=  E„   {P(G.<x+g)} 
°2  ^ 


(III.B.I.il) 


Since  Gamma  random  variables   are  non-negative, 


P(G^<x+g)    = 


if     g  ^  -x, 


F^    (x+g)        if     g  >  X, 
^1 


(III. B. 1.5) 


where  F„    (x+g)    is    the    cumulative    distribution   function  of    G, .      The 

expressions    are   shortened   from   G.(J,,1)    to   G(2,,1),    because   they  are 
i.i.d.      Therefore,    (III. B.I. 4)    can   be  written   as 


P(X<x)    =     j       F^(x+g)f^(g)dy, 
g=L(x) 


(III.B. 1 .6) 


where 


f^igil)    = 


H-l 


exp(-g)        g   >  0, 
r(Jl) 


otherwise, 


(II. B.I. 7) 


is  the   density  of   a  Gamma    (il,1)    random   variable    and   again,    because   of 
the  non-negativity 
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L(x)    = 


0        if      X   >   0, 
-X      if    X   <    0. 


(III. B.I. 8) 


Differentiating  (III. B.I. 6)  using  Leibniz'  rule  for  the 
derivative  of  an  integral  with  variable  limits,  we  have,  after  some 
simplification 


gS«D 


g=L(u) 


(III. B. 1.9) 


Now  if  J,  is  a  positive  integer,  (III. B.I. 9)  can  be  evaluated 
analytically  using  integration  by  parts.  If  8,  =  1  we  obtain  the  density 
of  the  standard  Laplace  distribution.  For  i  =  2,3,^  the  densities  are 
also  well  known  derivations  given,  for  example,  as  textbook  problems  by 
Feller  [Ref.  25:  p.  64].  Feller  however  looks  at  the  results  of 
(III. B. 1.9)  as  the  n-fold  convolution  (n  =  2,3,4)  of  i.i.d.  standard 
Laplace  random  variables.  Figure  III. B. 1.1  shows  the  densities  of  the 
2, -Laplace  random  variable  for  8,  =  1,2,3.4.  Note  how  the  graphs  take  on 
the  shape  of   a  Normal   density  with   a^    =  2 J,. 

2.      Numerical   Evaluation  of    the    SL-Laplace   Density 

If  2,  >  0  and  is  not  an  integer,  then  (III. B. 1.9)  must  be 
evaluated  numerically.  We  will  be  interested  in  the  evaluation  of  the 
density  in  (III. B.I. 9)  for  0  <  2,  <  1  ,  in  order  to  calculate  conditional 
densities   and  likelihood  function. 
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Figure  III. B. 2.1  displays  examples  of  densities  for  non-integral 
i  obtained  by  using  the  IMSL  numerical  integration  scheme  DCADRE  to 
evaluate  (III. B. 1.9).  The  upper  limit  of  integration  in  (III. B.I. 9)  is 
replaced  by  a  suitable  constant  m  >  1.  Since  for  g  >  1  and  fixed  i  and 
u  >  0, 


then 


|DCADRE-f^(ua)|       <      ^^P^r^Ur^^    •  (III.B.2.2) 


Difficulty   in  integrating  comes   about   because  of   the  singularity 

at    the    lower    limit    of    integration.       If    2,    ^    1,    this    singularity 

l-S,  Jl-1 

disappears   by  rewriting    (l/(g(g  +  y))  as    (g(g  +  y))         .      For    i   <   ^,    there 

are  two  alternatives  for  removing  the  singularity.     We  can  transform   the 

variable  of    integration,   g,    to  become  t   =  g     and  the  singularity  at 

g    =   0    is    removed.      Or,    we    could    do    an    integration  by   parts   to  remove 

either  the  singularity  at   g  =  0  for   u  >    0    or    at    g    =    -u   for    u    <    0.       In 

either    case,    the   remaining    integral  must  be  evaluated  numerically  for 

u   5t   0. 

Since    X    is    a    symmetric    random    variable    we    can    rewrite 

(III. B.I. 9)   using  integration  by   parts   to  obtain  an   expression  that   will 

be  easier  to  apply.      For   all   u  "   0 
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f  f^^.0\      -  eXp(-|u|  )  ^r"  g   (2(g  +  |u|  )+1-i.}        ,      _   ,  .         ,^^^     Don' 

i^^"''^^   ~lpTl)  — '  2-1 —  exp(-2g)clg.    (III.B.2.3- 

_  o     (g+|u|) 


g=0 


If  S,  ^  .5  note  that  f^Cu)  is  not  defined  at  u  =  0.   For  J,  >  .  5  and 
u  =  0, 


??  -1 
f^(0;Jl)  =  r{2i-])/{T^{i)    2  }  <   «   .       (III. B. 2. 4) 


3.      The  Square  Root   Beta-Laplace  Transformation 

The  principal  result  of  this  section  is  the  proof  of  the  so- 
called  square  root  Beta-Laplace  transformation  theorem.  By  this 
technique,  an  Jl  -Laplace  random  variable  can  be  transformed  into  an 
ilp-Laplace  random  variable  where  8,  <  2,  .  The  time  series  models 
developed  in  Sections   III.C  -   III.F  rely  on  the  following: 

Theorem; 

Let    X    -    il -Laplace   and  B  -  Beta(a,  i-a)  ,   where  0  <   a   <   H  and  B   is 

1  /2 
defined  on  the   interval    [0,1],    i.e.   standard  Beta.       If   Y    =    B        X,    then 

Y   -  a-Laplace. 

Proof; 

By  conditioning  on  B,  we  obtain  the  following  expression  for  the 
characteristic  function  of  Y; 


1  /2 
4)    (oo)    =   E{exp(iB        Xcd)} 
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Eg[E{exp(ib^^^Xw)}] 


2  N  1  A<- 


E„[{l/(l+ba)^)}  ].  (III. B. 3.1) 


Since  bco^  >  0,  a  convergent  power  series  representation  of 
(III. B. 3.1)  is  given  by 


"   [k] 
^^(o))  =  E„{  I  ^^-^—(-0)^)^"},  (III. B. 3. 2) 

^       ^  k=0  "' 


where  again   l^^-^   =   «,( Jl+1  )  ,  .  .( il+k-1 )    for   k   =   1,2,...;    2,''°-'    =  1. 

Interchanging   the    expectation   and   summation    in    a    convergent 
power  series   gives 


"       [k] 

K^^^   =     I     HT?-  ^-^^^^   ^iB^)'  (III. B. 3. 3) 

^  k=0      '^• 


From  Johnson  and  Kotz   [Ref.    36:      v.    2,    p.    40],    we  have 


E(b'^)    =  a^^/l^^^        for   k   integer.  (III. B. 3. 4) 


Substituting   (III. B. 3. 4)    and   (III. B. 3. 3),   we  have 


(1)^(0))    =      I     ^^j^j— (-0)^)''    =    ^Jl^^""'  (III. B. 3.5) 

^^^         '  "^  Q.E.D. 
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C.      il-LAPLACE    FIRST-ORDER  AUTOREGRESSIVE   TIME  SERIES   MODEL 
1 .      Introduction 

In  this  section,  we  exploit  the  square  root  Beta-Laplace 
transform  to  define  a  2-parameter  first-order  autoregressi ve  model  in 
J,-Laplace  variables.  The  first  parameter,  i,  determines  the  non- 
Gaussian  symmetric  marginal  distribution  of  the  time  series  ensemble. 
The  second  parameter,  a,  given  the  value  of  i,  determines  uniquely  the 
lag-1  serial  correlation.  Since  the  model  is  shown  to  be  first-order 
Markovian,  a  determines  the  entire  autocorrelation  function  up  to  the 
sign.  We  show  also  that  the  models  are  always  partially  time  reversible 
with  respect   to  both  runs   probabilities   and  directional   moments. 

Writing  the  stationary  time  series  {X  (i)l  in  the  form  of  an 
additive  random   coefficient   equation,   we  have 


X    (J,)    =   A^^^(a,Jl-a)X      Ai)    +   B^ ''^C  i-a,  a)  L   ( Jl)  ,  (III. C. 1.1) 

n  n  n-i  n  n 


where   {X    (X,)}    is  assumed  to  be   a  stationary  time  series   with  a   marginal 

1  /2 
i-Laplace    distribution;     {A         (a,2,-a)}    is  an   i.i.d.   sequence  such  that 

n 

1  /2 
A    (a,2,-a)    is    a   standard    Beta;     {B         (J,-a,a)}    is    an    i.i.d.    sequence 
n  n 

1  /2 
independent   of    {A        {a,l~o.)}   such  that  B   (8,-a,a)    is  also  standard  Beta; 

and    {L    { I) }    is    an    i.i.d.    sequence,    independent    of    the    coefficient 
n 

processes,    such  that   L    (2,)    is   il-Laplace.      The   coefficient   A   (a,e,-a)    and 
^  n  n 

B   (2,-a,a)    are  assumed  to  be   independent    of    X      .,    X      _,    etc.       If    it    is 
n  n—  I         n  —  <i 

assumed   that    X      Ai)    has   a   2,-Laplace   distribution,   then   by  the  theorem 

n-1 

in   Section   III.B.3   so   does    X    (J,).       The    fact    that    the    process    is 
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Markovian    follows    by    construction.      To   start    the    process    in   the 

stationary  distribution  set  X_(Jl)    =  L  (S,). 

The  parameter  space  is  J,  >  0  and  0  ^  a  ^   J,. 

For  the  Beta  random   variables   A      and   B      (hence   their   square 

n  n  ^ 

roots)    to   be   properly  defined,   each  of   the  parameters  must  be   positive. 

Hence,  when  a  =  0  or  a  =  5,,    (III. C. 1.1)    as    defined  above    is   no    longer 

1/2  1/2 

appropriate   because      each   of    A  and   B  has    a   parameter    that   is 

^  n  n 

1  /2 
identically  zero.      If  a  =  0,    it   is  understood  that    the    {A        }    sequence 

1  /2 
is    identically    zero    and    the    {B         }    sequence    is   one;    therefore, 

(III. C. 1.1)   becomes   X    (I)    =    L    (£)    and   the    {X    }    sequence    is    the    {L    } 

n  n  n  n 

1  /2 
corresponding  to  the   i.i.d.    8,-Laplace   case.      For   a  =   £,    A  is  one  and 

1  /2 
B  is    identically    zero;     therefore,     if    X.    =    L^(2,),    then    X       is 

n  J  »  '00  n 

Jl -Laplace,   but   is  not   an   ergodic  process. 
If  we  let 


e      =   B^^^(Jl-a,a)L    (il)  (III. C.I. 2) 

n         n  n 


then    by    the   Theorem    in   Section   III.B.3.     e       ~    (  !l-a) -Lapl  ace    with 

E(e    )    =0    and   Var(e    )    =    2(Jl-a)    for   all   n.      Since  the   variance  must  be 
n  n 

non-negative,    it  is  also  necessary  that   a  ^   ?,.      By    the    stationarity   of 

1  /2 
{X     }     and    since    A  (a,il-a)X       ,  (  £ )     is    independent    of    c     ,    the 

n  n  n-  1  n 

characteristic  function  of   the  right-hand  side  of    (III. C. 1.1)    gives 


♦x(u(">  ■  It^V  Ir^V"  ■  ?i^r-  (iii.c.1.3) 


128 


Examples  of  sample  path  behavior  for  selected  I  and  a  are  given  in 
Figure  III. C. 1.1.  Note  that  although  the  correlation  coefficient  is 
approximately  0.8  for  all  sets  of  I  and  a  in  Figure  III. C. 1.1,  there  is 
considerable  difference  in  the  sample  path  behavior  as  I  changes.  For 
the  samples  from  small  values  of  iK.IO  and  .05),  there  are  runs  of 
values  that  are  very  nearly  zero  in  magnitude. 
2.     Correlation  Structure 

Using    equation    (III. C. 1.1)    recursively    along    with    the 
stationarity  and  independence  of   the  process    {X   },   we  have 


E{X    {i)X  _    ii)} 
p(1)    =   Corr(X^(^),X^_^(0    =        Etx-      (,)  | 

n-1 


E[{Ay^(a,Jl-a)X^_^  ^^^"^n^^n-l  ^^^^ 

E{X2    Ai)} 
n-1 


E{A'''^(a,l!,-o))E{X^_,(t))  ,  ,, 

EIX-    Ai)}       -    EU^^2(^_^.^,,_         (Ill.C.S.n 

n-  I 


From   Johnson   and   Kotz    [Ref.    36:      v.    2,    p.    40],    we   have    for 

A      ~  Beta(a,  il-a)  ,   that 
n 


EipT^iaA-a))    =    ;:|^^;:^j:|^j  for   all    n.    r    >  0.  (III. C. 2. 2) 


where   r(.)    is   the   incomplete  Gamma  function.      Therefore 
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,    s    ^    r(a+1/2)r(il)    _   ar(a-^1/2)r(ll>1)  f^^^   r    :>    -i^ 

P\'^      r(]i+i/2)r(a)      s,r(j,^i/2)r(a+i)-  viii.u.^o; 


Note    that    as    a   ^'    2,,    then    p(l)    ■*    1.       Similarly   as    a   -►    0,     p(  1  )    ■*    0. 

Therefore,    we  obtain   a  full    range   of    positive    correlations    in    a 

one-to-one  function  of   a  for   any  given  value  of   I. 

Also  from    (III. C. 1.1),    we   see   that    the    process    is    explicitly 

autor egr essi ve .      It  is  also  autoregressive   in  the  sense  of   expectations 

in   that    E(X    (Sl)|x      Al)     =    x)    is    a    linear    function    of    x.       Since 
n         '    n- 1 

Ikl 
(III. C. 1.1)    defines   a  first-order  Markovian  process,    p(k)    =   p(1)'     '    for 

all   k.      To  see  this  we  write  for   all   k 


,.  .  n  n-k 

P^^^  =  Ti 


1    /9  Ef^n      l^'^^^n      L^^'^)^ 


=   p(1)p(k-1) 


=   p(1)p(k-2)p(1) 


=    {pd)}''''. 


If  we  replace  A^^^(a,2--a)    in    (III. C. 1.1)    by   -A^      (a,il-a)    we  have 


._  r(a-H/2)r(il)    _    _  ar(a+1  /2)r(il^1)  ,^„   r    ■:>  ^\ 

P^^^   =  "  r(2,^i/2)r(a)  =  "  ini^\/2)Y{o.^\)    '  uii.L.^.5; 
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We  can,   therefore,   achieve  a  full  range   of    negative    correlations,    and 
likewise 


p(k)   =   (-1)  ■'^'{pd)} ''^1  for  all   k.  (III. C. 2. 6) 


3.   Partial  Time  Reversibility 

The   J, -Laplace  first-order  autoregressi ve   models    are   partially 

time    reversible,    both    with    respect    to    the    directional    moments, 

{X^(2,)X         (l)}    for   m   =    0,     ±1,     ±2,...,    and    with    respect    to    runs 
n  n-m 

probabilities,    P{X^(5,)<X  _^ii)}    =   P{X    (il)>X  _^(J!.)}. 

Using  mathematical   induction,   stationarity   of    {X    (£)},    and   the 

independence  of   the  coefficients   and  the   innovation  from  each  other   and 

previous    values    of    {X    (£)},    it    is    the    case    that     { X  M  «, )  X  (2,)} 

^  n  n  n-m 

=    E{X    (il)X2       (i)}    =   0    for    all    n   and   for    all    m   =    0,    1,    2 Let 

n  n-m 

X  -   il-Laplace.      For  m  =   0,    E(X^)    =  0  by    (III. B.  1.2).      Assuming  for  m   =   k 

that    E(X^X       ,  )    =0,    we    have    for   m    =    k+1    after    substituting   from 
n    n-k  ^ 

(III. C. 1.1)    and    (III. C. 1.2)    that 


E{X^X       ,,  ^,,}    =   E{(A  X=^      +2A^^^X      ,e+G^)X       ,,     ,,} 
n  n-(k  +  l)  n  n-1        n       n-1    n     n     n-(k+l) 

=   E(A   )E{X2    .X      ,,     ,  ,} 
n  n-1    n-(k  +  1  ) 


=   E(A   )E(X^X   _    )    =   0.  (III. C. 3.1) 

11  1 1    n    K. 


Assuming    for    m    =    k    that    E(X   X^    ,  )    =   0,    we   have    for    m    =    k  +  1    after 

n   n-k 

substituting  again  from    (III. C. 1.1)   and   (III. C.I. 2) 
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E{X  X^    .,     ,,}    =   E{X^    ,,     ,,(A^^^X        +e    )} 
n  n-(k+1)  n-(k  +  1)      n       n-1      n 

=   E(A^^^)E{X2    -,     ,,X      ,} 
n  n-(k+l  )   n-1 


=   E(A^^^)E(X2    ,X    )    =  0. 
n  n-k  n 


(III. C. 3. 2) 


To  see  that  this  model    is  also   partially   time   reversible   with 

respect    to  runs   probabilities,   we  show  that   the  random   variable  Z     =   X 
^  n  n 

-    X       ,     is    symmetric.       Now    Z       is    symmetric    if    and    only    if    the 
n- 1  -^  n  -^  ^ 

characteristic  function  of  Z      is  real   valued.     We  write 

n 


^^{i^)    =  E[exp{iw(X^  -  X^_^)}] 


=  E[exp[ia){e  -(l-A^^^)X   .}]] 
n     n    n-i 


E{exp(iwe    )  }E[exp{ -ia)(  1 -A^  ^^)X      ,}] 
n  n  n-i 


-^Y'""   E.[E[exp{-iu)(l-a^^^)X^    .}]] 
+0)    J  A*-  n-i 


1     U-a 


1+w' 


^      !^  1/2,2     2 

1  +( 1-a         )    00 


(III. C. 3. 3) 


Since    (III. C. 3.3)    is  real    valued  that   concludes   the   proof 
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D.      THE   BETA -LAPLACE   AUTOREGRESSIVE  MODEL,    BELAR(I) 
1 .      Introduction 

In  this  section,   we  set   Jl   =   1   in    (III. C. 1.1)    and   (III. C.I. 2)    to 
obtain  the  following  expression  for   the   BELAR(I)    process 


X      =   A^'^^(a,1-a)X      ,    +    e    ,  (III. D. 1.1) 

n         n  n-i  n 


where    {e    }    is    an    i.i.d.    sequence  with  e      -   ( 1-a)-Laplace  with  moments 
n  n  ^ 

and  density  given  by    (III. B.I. 2)    and   (III. B. 2. 3).      X     now  has   a  standard 

n 

Laplace   marginal    distribution.      The    only   parameter   in  the  model    is  a 

with  0  ^  a  ^   1.      All  the  results   of  Section  III.C   still   hold  with   i  =   }. 

Examples   of   sample  path  behavior   are  given  in  Figure  I I I. D. 1.1. 

We    do    two   things    in    this    section.       First,    we    derive    the 

equations    for    the    conditional    density   of   X    Ix      ,.      The  second  is  the 
^  -^  n '    n-1 

derivation   of    joint    density   and   the    logarithm    of    the    likelihood 

function.      The    expression    is    used    in   Section    III.E.6    to    obtain  the 

maximum  likelihood  estimate  for   a. 

2.      The  Conditional   Density 

To  find  the   conditional    density   of    X    IX      , ,    we   will    need   the 

^  n  '    n-1  ' 

1  /2 
density   of   A        (a,l-a).      Let   A     be   a  standard  Beta  random  variable  with 
•^  n  n 

parameters    (a,1-a).      Since  A      is    defined   only   on   the    given    interval, 
zero  to  one 
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n 


P(A  <a^) 

n 


0  <  a  <  1 
otherwise 


x=a' 


[   f^  (x;a)clx,    0  <  a  <  1,     (III. D. 2.1) 


x=0 


where  f   (x;a)  is  the  standard  Beta(a,1-a)  density  given  by 
n 


f^  (x;a)  = 
n 


a"^  ^1-a)°'/r(a)r(1-a)   0  <  a  <  1, 


otherwise.   (III. D. 2. 2) 


Differentiating   (III. D. 2.1)   with  respect   to  a,   we  obtain   the    following 
expression  for 


^  ra      1    =  2  a 

i^^^^Ka;a)        r(a)r(1-a) 

n 


2a-1 


( 1-a    ) 


0  <   a  <    1  . 


(III. D. 2. 3) 


Examples   of    (III. D. 2. 3)    are  given  in  Figure  III.D.2.1. 

Now  we   evaluate  P(X^   <    x|X   _,=y)    using    (III. D. 1.1),    (III. B.I. 2), 

1  /2 
and   (III. B. 2. 3).     Conditioning  on  A        (a, 1-a)    we  obtain 
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1/2 


P(X„<x|X^_^=y)    =   P{A^      (a,1-a)X^_^    .    e^  <    x|X^_^=y} 


=   E  T/2tPte^   <    X  -   yAy^(a,1-a)|Ay^=a}] 

A. 
n 


A. 
n 


■  ^.1/2'^  '^-^y" 

A  n 

n 


a=L2(x) 


=   I  F^    (x-ay)    f    ^/2^^5a)    da    ,  (III. D. 2. 4) 


a=L^(x) 


where  from    (III.B.2.3)    the   cumulative   distribution  function  of   e      can  be 

n 


written  as 


F      (x-ay)    = 
e 
n 


u=x-ay 


+         [        f      (u;l-a)du  if    x-ay   >   0, 


u=0 


(III. D. 2. 5) 


u=ay-x 


I"        f      (u;1-a)du  if    x-ay  <    0, 


u=0 


and  L.(x),    i    =    1,2   are   the    limits    of    integration  on   a   which   may    be 
functions   of   x. 
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Since  F     (x-ay)   changes    definition   for    negative    and   positive 
n 

(x-ay)    and   since    0   <    a   <    1  ,    we  rewrite   (III.D.2.^)   based  on  the  ratio 
x/y,  which  is  a  constant.     Thus 


P(X  <x|X       =y)    = 
n      '    n-1    •' 


a=1 

f     F      (x-ay)f       .    (a;a)da     if   x/y  ^    1    or   x/y   ^   0; 

a=0  n 


(III. D. 2. 6) 


a=x/y 

j"  F^   (x-ay)f    ^^2^^'°'^^^ 

•'^  ^n  A 

a=0  n 


a=1 


+    f       F      (x-ay)f    ,  ,^(a;a)da     if     0  <    x/y  <    1 


.        n  A 

a=x/y  n 


Differentiating  (III.D.2.4)  with  respect  to  x  using  Leibniz' 
rule  gives  the  following  general  expression  for  the  conditional  density. 
We  have 


n '    n-1 


a=L2(x) 


-I 


f      (x-ay;l-a)f    ^^^^^JO')^^ 

,     .    ^  n  A 

a=L    (x )  n 


+    F      {x-yL„(x)}f    .   .„{L-(x);a}   ^L^(x 
e  2  .1/22  dx   2 
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-F      {x-yL,(x)f    ,  ._{L,(x);a}^L,(x).        (III. D. 2. 7) 
e  1  ,1/2.1  dx   1 

n  .  A 

n 


From    (III. B. 2. 3),    (III. D. 2. 3)   and   (III. D. 2.5)    set 


h(g,a) 


_  2a^"   ^exp{-(2g+|x-ay|  )}   g  "°^(2g+2|  x-ay  | +a) 
r^(l-a)    r(a)    (1-a^)°'   (g  +  |  x-ay  |  )^  ■'°'    (1-a) 


(III. D. 2. 8) 


Now  using    (III. D. 2. 7)    to  differentiate  each  expression   in    (III. D. 2. 6), 
we  have  the  following  explicit  expressions  for 


a  =  1    g=<» 

(        (     h(g,a)dgda     if     x/y  >   1      or     x/y  <  0   , 

a=0  g=0 


f^  |x      (=1)"  ■  I 

n'    n-1 


(III. D. 2. 9) 


a=x/y  g=<» 

f  ["       h(g,a)dgda 

a=0        g=0 


a=1      g=«» 
+    [  f       h(g,a)dgda     if     0  <   x/y  <    1    . 

a=x/y  g=0 


It    will    be    seen    later    that    working    with    (III.D.2.9)     will     be 
inconvenient.      Hence,   we  rewrite   (III.D.2.9)    as 
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Ty.    |,       (x|y)   . 

n'    n-1 


a=1 

[      f^    {(x-ay);1-a}f    ^^^U;<x)da      ^^   ""^^  "    ^ 

"^^        n  A  or   x/y   ^  0 

a=0  n 


a=x/y 

f  f^    {(x-ay);1-a}f    ^^^ia;oi)cia 

A  n  A 

a=0  n 


a=1 


(III. D. 2. 10) 


+    [        f^    {(x-ay);1-a}f    ^/2(a"'C')da   if    0   <    x/y   <    1. 


a=x/y 


The  conditional  density  in  (III. D. 2. 10)  can  assume  different 
shapes  as  a  function  of  x  depending  on  the  fixed  conditioning  value,  y, 
and  the  particular,  fixed  a.  If  a  =  0,  then  (III. D. 2.  10)  becomes  the 
standard  Laplace  density  as  given  in  (II. B. 1.1)  with  p  =  0  and  X  =  1. 
If  y  =  0,  then  (III. D. 2. 10)  becomes  the  ( 1-a)-Laplace  density  as  given 
in  (III. B. 2. 3)  with  I  =  1-a.  In  Figure  III.D.2.2  are  presented  different 
examples  of  (III. D. 2. 10)  for  a  fixed  y  and  different  values  of  a.  Note 
that  if  a  <  1/2  then  (III. D. 2. 10)  is  continuous  for  all  x.  If  a  ^  1/2 
and  X   =  y,    (III. D. 2. 10)    is  undefined,   e.g.,    x   =  y   =  0. 

In   a   similar    manner,    expressions   for    ( III.D.  2.  4)- ( III.D.  2. 10) 

can    be    derived    for    the    BELAR( 1 )    model    with   negative    correlations. 

1  /2 
Placing    -A  (a, 1-a)    in    (III. D. 1.1),    we   replace    x-ay    by    x+ay   and 

determine  the  appropriate  form  of   the   conditional    density    based   on   the 

ratio    (-x/y).     We  have  for   the  negative   BELAR(I)    process 
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^x  Ix    /■'I"'  = 

n'    n-1 


a=1 

f      f^    {(x+ay);1-a}f    ^^^{a;(x)6a 

a=0  n 

if  -x/y  >   1    or   -x/y  >   0, 


(III. D. 2. 11) 


a=-x/y 

f  f^   {(x+ay),1-a}f    ^^^^^J^'^^^ 

a=0  n 


+      f  f^   {(x+ay);l-a}f    ^^^{a;a)cla 

a=  -x/y  n 

if   0  <  -x/y  <    1  . 


3.      The  Joint   Distribution  and  the   Likelihood  Function 

An   expression  for  the  joint   density  of  X    ,...,X      can   be  written 

using  f      I  (x    |x  _    )   and  f      (x    )   as  follows: 

n'   n-1  1 


n-1 

^X    ...x/^n ^1^    "   ^x/^1^    /,    ^X      ,,     ,JX      /^n-(k-l)  l^n-k^* 

n  1  1  k  =  1        n-(k-l)  '    n-k 


(III. D. 3.1) 


The  log-likelihood  function  as   a  function   of    a   given    {X    }    is   just    the 


natural   logarithm  of    (III. D. 3.1).      We  have 


n-1 


L(a)    =   -(in  2   +    |x    |)    +      I     Inff^  j^        (x^    ,.     .Jx^    .)}. 

'     I'  ,1  X      ,.     ,^X      ,       n-(k-1)n-k 

k=1  n-(k-1 )      n-k  ' 


(III. D. 3. 2) 
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It    is    now    a    simple    matter    to    determine    which    branch   of 
(III. D. 2. 10)   or    (III. D. 2. 11)    is  needed  for    each    pair    (x    ,x    _    )    and   to 
substitute    it    into    the    sum    in    (III. D. 3. 2).      We    postpone   further 
discussion  of   the  likelihood  function  until  Section  III.E.6. 
M.      Numerical  Evaluation  of   the  Conditional  Density 

a.      Introduction 

This  section  is  devoted  to  explaining  the  methodology  by 
which  we  came  to  resolve  the  problems  in  the  numerical  integration  of 
the  conditional  density.  This  is  as  important  an  issue  as  the 
derivation  itself,  since  the  likelihood  function  and  the  maximum 
likelihood  estimators  can  not  be  evaluated  without  it.  As  is  pointed 
out  below,  the  standard  numerical  routines  were  unsuccessful  in 
accurately  evaluating  (III. D. 2. 9)  around  the  singularities  in 
(III. D. 2. 8).  We  also  give  and  justify  the  approximations  that  were  used 
to  remove  each  of  the  singularities.  The  graphs  in  Figure  III.D.2.2 
were  obtained  using  the  method.  The  methodology  was  used  again  in 
Section  III.E.6  to  evaluate  the  log-likelihood  function  in  the  method  of 
maximum  likelihood  estimation. 

In  the  FORTRAN  routine  that  calculates  the  conditional 
density  as  given  in  (III.D. 2. 1 0) ,  the  approximations  in  (III. D. 4. 6), 
(III. D. 4. 8)  and  (111.0.4.11)  are  added  to  the  results  from  DCADRE. 
Combinations  of  these  approximations  are  invoked  as  necessary  depending 
on  the  ratio  x/y. 

The  same  procedure  is  used  to  evaluate  the  density  in 
(III.D. 2.11)    for   the   BELAR( 1 )   model   which  produces   negative   correlations 
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for  odd  lags.  We  just  check  for  0  <  -x/y  <  1  and  choose  the  appropriate 
value  of  c  in  (III.D.M.6)  and  (III. D. 4.8)  where  x-ay  is  replaced  by 
x+ay. 

b.      The  Methodology 

Attempts  to  evalute  the  conditional  density,  as  given  by 
(III. D. 2.8)  and  (III. D. 2. 9),  using  the  standard  IMSL  double  integration 
routines  failed.  Even  the  IMSL  routine  DBLIN  which  is  often  successful 
in  handling  ill-behaved  integrands,  was  unable  to  evaluate  (III. D. 2.8) 
around  the  singularities.  For  a  <  1/2,  along  the  lines  a  =  0  and  a  =  1, 
(  III. D.  2.8)  is  unbounded.  Similarly  for  a  ^  1/2,  along  the  line  a  =  1 
and  at  the  point  (g,a)  =  (0,x/y)  for  0  <  x/y  <  1,  (III. D. 2. 8)  is 
unbounded.  Arbitrarily  declaring  (III. D. 2. 8)  to  be  zero  under  these 
conditions  did  not  always  allow  DBLIN  to  accurately  evaluate 
(III. D. 2. 9)  . 

We   succeeded    in    evaluating   the    conditional    density    by 

working  with   the    form    given    by    (III. D. 2. 10)   with  f  (a;a)    given   by 

A 
n 

(III.D.2.3)    and  f      {(x-ay);1-a}    given   by    (III. B. 2. 3).      We   used    the    IMSL 
e 
n 

routine  DCADRE  to  construct  an  extensive  table  of  values  for  the  (1-a)- 
Laplace  density  with  the  intention  to  linearly  interpolate  from  the 
table  as  needed.  The  error  in  the  value  of  f  (|u|;1-a)  in  the  table  is 
controlled  by  DCADRE.  The  error  in  the  value  of  f  (lu^|;1-a)  obtained 
by  using  linear  interpolation  for  [u  |  not  in  the  table  is  calculated  in 
the  standard  way.      From  Gerald   [Ref .    28:      p.    168] 
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I  Error   Interpolation!    = 


h^sCs-l) 


d^f      (c;l-a) 


du' 


(III. D. 4.1) 


where  h  is  subinterval  length  and  s  =  (u^-u)/h.  Substituting  the  second 
divided  difference  into  (III. D. 4.1),  in  place  of  the  unknown  second 
derivative  and  also  noting  that  the  worst  case  for  linear  interpolation 
is  at   the   center   of   the  subinterval,   we  have 


I  Error  Interpolation!    < 


-1  A^f      (|u!;1-a) 
n 


(III. D. 4.2) 


where    A   f         is    the    second    difference.      Because   f       (   u    ;1-a)    is  non- 
n  n 

negative  and  monotone  decreasing  in  |u!,  the  largest  values  of  A^f  are 
in  subintervals  close  to  zero.  The  table  that  was  constructed, 
therefore,  uses  smaller  subintervals  close  to  zero  and  larger 
subintervals  further   out. 

Finally  we  used  DCADRE  again  to  evaluate  (III. D. 2. 10)  except 
near  the  singularities,  which  we  were  able  to  evaluate  analytically  and 
then  add  back.  The  technique  is  often  referred  to  as  "removing  the 
singularity" . 

c.      Removing  the  Singularities   Due  to    (III. D. 2. 3) 

We    now     describe    how    we    evaluated    the    integrals    in 

(III. D. 2. 10)    in  the   vicinity  of   the  singularities    in    (III. D. 2. 3).      We 

1  /2 
see  that  the   density  of  A        (a,1-a)    given   in    (III.D.2.3)    is  undefined  at 

a  =  0  and  a   =   1    for   a  <   1 /2  and  at   a  =  1    for   a  ^   1/2.      We  also  note  from 

(III.D.2.3)    that  for  small   6  >   0  and  a  <   1/2 
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f,l/2<^  =  ">      :     r(a)r(l-a)    •     0<^<   i:  (IH.D.^.3) 

n 


and  for   all   0  <   a   <   1 


f  (a;a)      -     — ,      1-6    <  a  <    1    .  (III. D. 4. 4) 

k  ~     r(a)r(1-a)(1-a^)°' 

n 


Therefore  for  a  <   1 /2  and   1    ^  x/y  or   x/y  <  0  we  have  from    (III.D.i4.3) 


I     f   ,/2(a;a)f^  ((x-ay);l-a)da  ;     J    ^,^-||A_  f^   ,  (x-ay);1-alda. 
n     k  n  '  n 

^=°        "  ^=°  (III.D.4.5) 


Since  f   (•)  IS  continuous  in  this  situation,  there  exists  a  number  c  so 
e 

n 

that     0  <   c  <   6  and    |x|    ^    |x-cy|    ^    |x-6y|   and 


a=6      2   2a-1  a=6  2a-1 

/      r(a)r(1-a)^    {(x-ay);1-a}da   =   f^    {(x-cy);1-a}    |      ^(aKd-a)    ^^ 
a=0  "  "  aio 


2 

^      r /  s    1       ,       (1 -a) 5 

=    f      l(x-cy);l-a} 


e    '^        '"      "'    r(2-a)r(1+a)    * 
n 

(III. D. 4.6) 

A    natural    approximation    for    c    allows     jx-cyj    to    be    the    average, 
(1/2) |2x-6y|  . 

For    all   a  and    1    <    x/y  or    x/y   <    0  we  have   from    (III.D.4.4) 
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a=1 

f  f   ^^^ia;(i)f^   {(x-ay);1-a}da 

.     r     A  n 

a=1-6        n 


a=1  ,  p 

=      f  T.^-^w■■_-^  ,.   f^   {(x-ay);1-a}da    .    (III. D. 4. 7) 


r(a)r(1-a)    r.      2^ct     e 


(^-'J' 


Likewise  there  exists   a  new  number   c  so  that    1-6   <  c  <    1    and 
x-yl    <    |x-cy|    <    |x-y+6y|    and 


a=1  ^  2a 

i  r(    \^(^ T  ^     { ( x-ay ) ;  1  "tt }  da 

I  r(a)r(1-a)    (,      ,\(i     e 


('-')' 


a=1  ,  „ 

=    f      {(x-cy);l-a}       f  ^.    , ,'  r  [ ^^—]^a 

J  r(a)r(1-a)  (i.^.^a^ 

a=l  ~o 


-  ^.  "--°y)"-°'  r(uf)r!2-.)   •  (III.D...8) 


Again  a  natural   approximation   for    c   allows    |x-cy|    to   be    the    average, 

(l/2)|2(x-y)+6y|. 

d.   Removing  the  Singularity  Due  to  (III.B.2.3) 

The  final  type  of  singularity  occurs  when  0  <  a  =  x/y  <  1 

and  a  Z    1/2.   When  this  situation  occurs  we  leave  f   (•)  under  the 

n 

integral     and    argue    that    in    a    6 -nei  ghbor  hood    around    x/y    <     1, 

f       .    (a;a)    ==  f   .  .p(x/y;a).      Note  that   by  the  same  argument   that    gave    us 

A  A 

n  n 
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(III. D. 4.6)    and    (III. D. 4. 8),   there  exist   two  numbers   c     and  c     so  that 
(x/y)-6   ^  c^  x/y  and  x/y  ^  Cp  ^    (x/y)+6  and 


a=x/y 

I  f  T/2(a;«)fe   {(x-ay);l-a}da 

a=(x/y)-6     n 


a=(x/y)+6 

+      I  f   ^/2^^5a)f^   {(x-ay);1-a}da 

"^     ,  A  n 

a=x/y  n 


a=x/y 
=   f   ^/2^^'['^^'>    f  ^e   {(x-ay);1-a}da 

n  a=(x/y)-6 


a=(x/y)+6 
+    f    1/2^^2"^^    f  ^      {(x-ay);1-a]da.  (III. D. 4.9) 

n  a=x/y 


We  chose  to  approximate  c  and  c_  both  by  x/y  for  x/y  *  ±  1  ,  and  have 

f   .p(x/y;a)  <  "  for  all  a.   If  x/y  =  1  or  x  =  0  and  y  =  0  simultane- 

A 
n 

ously,   the   value  of    (III.D.2.10)    is  undefined  for   a  ^   1/2. 

Now  changing  the   variable  of    integration  so  that    (x-ay)    =  u, 
we  have  from    (III. D. 4.9)    that   for   all   a  ^    1 /2 


a=x/y  a=(x/y)+5 

[  f^    {(x-ay);l-a}da   =      f  f^   { (x-ay) ;a}da , 

a=(x/y)-6  a=x/y 
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1        u=|y6| 
u=0 


^    (i)    TTT    .  (III.D.4.10) 


2^  ITT  ' 


because   f      (•)    is   a   symmetric  density.      That    is    (III,D.4.10)    is   an 
n 

expression  for  -i — r  P(0  <  e  <  |y6|)  where  e  is  the  ( 1 -a) -Laplace 
innovation  random  variable.  Therefore,  we  add  back  to  the  DCADRE  result 
the  amount 


[jfr-jf   ^/2(x/y;a){P(0  <   e^   <    |y6|)}    <    [j^  f   ^^^{x/y;a)    <   »   ,    y   *   0. 
'^  '      A  '^  '         A 


1/2 

f  r  u/y;a;    <. 

(III. D. 4. 11) 


We    choose    the    following    combination   as    the    value    for 
P(0  <   e     <    |y6|  ). 

i)     Using  the  trapezoidal   rule  and  the   table  of   values   for 
the    ( 1-a)-Laplace   density  we  found 


u=M 

P^(0   <    e^   <    |y6|)    =   1 /2   -      [  f^    (u;1-a)du    .  (III. D. 4.12; 

I.I         n 
u=  y6 


Equation   (III.D.4.12)    is  the   average  of   the   upper   and  lower   Riemann  sums 

of   the  tail   of   the    density   subtracted   from    1/2.      Using    (III.D.4.12) 

instead    of    directly    integrating   f       (u;l-a)    from    zero   to    |y6|     is 

n 

preferrable,   because  for   a  S    1/2,   f      (0;1-a)    is  undefined.      The  error   in 

e 
n 
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(III. D. 4. 12)    from    using    the    trapezoidal    rule    approximation   is 


approximately 


h 


i 


,,   A^f      (i) 
1 2  G 

n 


in  the   i        subinterval.      Even  though    there 


are  over    MOO  subintervals,   the  second  differences   A^f    (i)  are  very  much 
smaller  for   a  ^   1/2   in  the   interval    [|y6l,M]. 

ii)     A  second  measure  of  P(0<e  <ly6l)   is  the  lower  sum 


P-(0  <   e„   <    |y6|  )    =    |y6|f      {(y6);1-a}, 

^  nil  I  '       c* 


(III.D.4.13) 


since  P(0  <   e     <    |y6l  )   is  always  at   least  as   large   as    (III. D. 4. 13).      Our 

approximation  for   P(0  <   e     <    |y6|  )    is  the   maximum    of    (III.D.4.12)    and 

(III.D.  4. 1  3)  .     We  use  the  maximum   because  P.    given   by    (III. D. 4. 12)    could 

be  negative  when    |y6|    is  close  to  zero.      This  follows   because  F      (u;1-a) 

n 

is    strictly   decreasing   for    u    >   0,    and  thus   the  trapezoidal   rule  over- 
estimates  the   integral    in    (III.D.  4. 1  2)  . 
E.       PARAMETER   ESTimTION    IN   THE   BELAR(  1  )    PROCESS 
1 .      Introduction 

In  this  section,  we  develop  estimators  for  the  parameters  in  the 
BELAR(I)  process  and  report  results  on  properties  of  these  estimators 
obtained  from  analytical  comparisons  and  simulations.  We  examine 
estimators  for   the   location   parameter,    y,    and   the    scale    paramter  ,    X, 

of    the    series     {X    };     the    parameter,    a,    of    the    random    coefficient 

1  /2 
A        (a,1-a);    and  Y,    the   lag-1    serial    correlation,    which    is    a   monotone 

function  of   a. 
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The  theory  of  conditional  least  squares  estimation  for  the 
BELAR(I)  process  using  the  linearized  residual  is  derived  using  results 
from  Nicholls  and  Quinn  [Ref .  16].  We  give  a  corollary  to  their  Theorem 
3.1   pertaining  to  the  strong  convergence  and  asymptotic  Normality  of   the 

least  squares   estimator   of   Y,    the  lag-1    serial   correlation.      An  estimate 

1  /2 
for    a   is    derived  using  the  fact   that   Y  =  E{A        (a,1-a)}.      Also,   we  show 

that  the  joint  least  squares  estimator  of  location   and   correlation  for 

the   BELAR(I)    process   is  the  same  as   for   the   linear  AR(1)    processes. 

Other  estimators  of  lag-1  serial  correlation  in  the  BELAR(  1  ) 
process  are  derived  using  the  ideas  of  robust  estimation  of  Huber 
[Ref.  37]  and  least  absolute  deviation  (LAD)  estimation  as  applied  to 
ordinary  linear  autoregressive  models  by  Denby  and  Martin  [Ref.  38]  and 
Bloomfield  and  Steiger  [Ref.  39].  Although  these  estimators  are 
consistent  and  asymptotically  unbiased  in  linear  models,  for  the  random 
coefficient  models  the  results  of  the  simulation  study  show  that  they 
have   a  bias   that   does   not   go  to  zero  asymptotically. 

The  maximum  likelihood  estimator  of  a,  ct^  _ ,  is  found  using  an 
iterative  technique  with  the  initial  estimate  being  derived  from  the 
least  squares   estimate  of   serial    correlation,    T      . 

Many  of  the  simulations  comparing  the  different  estimators  are 
conducted  within  the  framework  of  SIMTBED  [Ref.  15].  From  the  Summary 
Statistics  table  generated  by  SIMTBED  for  each  estimator,  it  is  possible 
to  draw  conclusions  concerning  the  bias,  the  variance  at  different 
subsample  sizes,  the  asymptotic  variance,  and  how  fast  the  estimator 
approaches  asymptotic  Normality.  In  the  SIMTBED  program,  one  can 
specify   the    total    number    of    samples    examined  at   each  subsample  size. 
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The  total  number  of  samples  used  is  the  product  of  three  parameters,  N, 
M,  and  NSR.  Three  combinations  of  these  parameters  were  used.  Table 
III. E.  1.1  is  a  summary  of  the  number  and  types  of  subsample  sizes,  N, 
and  the  number  of  independent  repetitions,  M,  of  each  type  of  simulation 
conducted  using  SIMTBED. 

TABLE    III.  E.  1.1 
Summary  of  SIMTBED  Types 


Number   of 
Super 
Replications 
Type  25  50  75  100  125  175        250        500  (NSR) 


Subsample  Sizes    (N) 


I  2000        1000        660  500  400  280        200        100  5 

II  4000        2000      1330  1000  800  570        400        200  10 

III  8000        4000     2660  2000        1 600        1140        800        400  10 


Each  entry  in  a  Summary  Statistics  table,  which  is  the  output  of 
SIMTBED  after  super  replication,  is  a  pair  corresponding  to  a  mean 
(average  over  the  number  of  super  replications,  5  or  10)  and  an 
estimated  standard  deviation  of  that  mean  value.  From  Table  III. E. 1.1, 
it  is  clear  that  a  large  number  of  independent  realizations  was  used  in 
the  computation  for  each  super  replication  and  the  different  subsample 
sizes.  Because  of  this,  subsequent  tests  of  hypothesis  that  we  use  on 
the  simulation  outputs  will  be  t-tests  on  the  mean  of  a  random  sample  of 
size  5  or  10  drawn  from  a  Normal  population  where  o^  is  unknown,  but  is 
estimated  from   the  sample. 

Before  describing  each  estimator  and  simulation  experiment,  it 
is  convenient  now  to  summarize  the  conclusions  of  this  investigation 
into  the  estimation  of   parameters   in  the   BELAR( 1 )    process: 
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a.  The  simulation  results  from  SIMTBED  indicate  that  both  the  sample 
median  and  sample  mean  are  asymptotically  Normal  estimators  of  y. 
The  asymptotic  variance  of  the  sample  mean  is  approximately  twice 
that  of  the  sample  median  across  all  values  of  the  correlation 
coefficient,    Y. 

b.  The  simulation  results  from  SIMTBED  also  indicate  that  the  mean 
absolute  deviation,  given  in  (II.E.3.2),  is  an  unbiased  and 
asymptotically  Normal  estimator  of  the  scale  parameter,  X.  It 
also  has  the  smallest  asymptotic  variance  of  the  three  estimators 
considered. 

c.  The  least  squares  estimator  of  Y,  the  lag-1  serial  correlation  is 
asymptotically  unbiased  and  Normally  distributed.  Simulation 
results  support   this  conclusion. 

d.  Simulation   of    other    estimators   of   lag-1    serial    correlation  based 

on  non-linear  residuals  of   the  form    R      =    X   -YX      ,    +    Bf(X    ,X      ,) 

n  n        n-1  n      n-1 

indicates    that    the    value   of    (Y,6)    that    maximizes    the    sum    of 

squares  of  R      is  approximately   (Y     ,0). 

n  Lio 

e.  Robust  estimators  of  serial  correlation  based  on  certain  symmetric 
loss  functions  of  the  linear  residual  (other  than  the  sum  of 
squares)  are  biased  and,  apparently,  asymptotically  biased. 
SIMTBED  outputs  of  the  Huber(c),  rank  and  LAD  estimators  of  lag-1 
serial   correlation  clearly  exhibited  this  result. 

f.  The  maximum  likelihood  estimator  of  Y,  the  lag-1  serial 
correlation  was  computed  by  the  iteration  scheme  given  in  Section 
III.E.6  for  simulated  data  from  the  BELAR( 1 )  process.  Results  of 
the  simulation  appear   to  indicate  that   the  estimator   is  converging 
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to  a  Normal   distribution  with  a  mean    value    equal    to   the    true    Y. 
In   comparison   to   the    least   squares    estimator,    the    simulation 
results    indicate   that    the   maximum    likelihood   estimator    has    a 
smaller   variance   and  bias   at   all    values   of   Y. 
2.      Estimators   of   Location 
a.      Introduction 

The  sample  median,  m,  and  the  sample  mean,  X,  are  two 
commonly  used  estimators  of  the  location  parameter,  \i,  in  a  stationary 
process  with  a  symmetric  marginal  distribution.  The  sample  median  is  a 
particularly  attractive  alernative  to  X  when  the  symmetric  distribution 
is  also  thick-tailed.  (It  is  well  known  that  for  i.i.d.  processes  with 
a  double  exponential  marginal  distribution  that  the  sample  median  is  the 
maximum  likelihood  estimator  of   y) . 

For    i.i.d.    processes,    it   is  well   known   (Dudewiez,    [Ref.    40: 


p.    221])    that  X   has   an  asymptotically  Normal   distribution,    NCOj/oy/n). 


Likewise,    m    is    asymptotically  Normal,    N{0,    /T74nfT(x7)} .      The  results 

A         .D 

for    the    sample   median    hold    provided    f„(x    ^)    is    continuous    in    a 

A  .  0 

neighborhood  around  x  ^,    is  positive,   and   is  bounded  above. 

The  problem  of  estimating  \i  from  dependent  data  is  more 
difficult.  Analytical  results  exist  about  the  limiting  distribution  for 
X  in  ergodic  processes  and  for  the  sample  median  for  processes 
satisfying  certain  mixing  conditions.  (Mixing  processes  are  those  for 
which  random  variables  "sufficiently  far  apart"  are  approximately 
independent) . 
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Since  the  BELAR(I)  process  is  an  RCA(1)  process  with  i.i.d. 
innovation  and  random  coefficient  processes,  {X  },  is  ergodic  (Nicholls 
and  Quinn  [Ref.  16:  p.  37]).  Therefore  X  is  still  an  unbiased 
asymptotically  Normal  estimator  of  y,  but  the  variance  is  modified  by 
the  factor 


1    +   2    I      Y      =    (1+T)/(1-Y) . 
k  =  1 


(III. E. 2.1) 


See,   for   example.    Priestly   [Ref.    33:      p.    3^33. 

The    problem    of    estimating   the   median  has   been  studied  for 

cases    where   the    data   are    dependent.      From   Heidelberger    and    Lewis 

[Ref.     41)],    we   have    that    the    usual    order    statistic    point    estimate 

(sample  median)    is    still    valid,    but    the    variance    is   modified    by    a 

factor,   p(x      ).      Here  p(x      )    is  the   initial   point   on  the  spectrum   of   the 

binary  process    {I   (x   ^)},   where 
n      .0 


I„(x)    . 


1      if   X     <   X, 

n 

0     otherwise. 


(III. E. 2. 2) 


That   is 


p(x   ^)    =   lim  n  Var    {    T      I.(x   _)/n} 
.5  n^°°  .,      1      .5 


(III. E. 2. 3) 


As  was   already  pointed  out,   conditions    for    convergence    and 
Central    Limit    Theorems    for    the    sample    median    depend   on  mixing 
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conditions.     There  are  several   kinds  of    mixing   conditions.       It    is   not 
known,  however,    if   the  BELAR( 1 )    process  satisfies  any  of   them. 

However,  the  LAR(1)  process  does  satisfy  the  mixing 
conditions  of  Gastwirth  and  Rubin  [Ref.  14].  Thus,  for  the  LAR(l) 
process,  it  is  known  that  the  sample  median  has  an  asymptotic  Normal 
distribution  with  mean  zero,   and  variance  given  by 


+  00 


II  II  II  (        \ 

I      {yl'^lcoshCx   ^yl"!)    +  sinh(x   ^yI*"')}    =    H-^  .  (III. E. 2. 4) 

Gastwirth  and  Rubin  [Ref.  14]  showed  that  for  the  LAR(1) 
process,  the  asymptotic  variance  of  X  is  twice  that  of  the  sample  median 
across   all   values   of   serial   correlation. 

The  question  here  is,  what  are  the  properties  of  the  sample 
median  in  estimating  u  from  data  of  the  BELAR( 1 )  process?  Also,  how  does 
the  sample  median   compare  to  X  in  the   BELAR(I)    process? 

Since    {X    }    from    both  the   BELAR(I)    and  the  LAR(  1 )    processes 

n 

have  a  marginal  Laplace  distribution  and  first-order  autor egr essi ve 
correlation  structure,  the  hypothesis  is  that  the  sample  median  from  the 
BELAR( 1 )  process  behaves  similarily  to  that  generated  from  data  in  the 
LAR(  1  )  process.  Also,  the  relative  efficiency  of  m  to  X  is  the  same  in 
the  two  processes. 

To  substantiate  this  assumption,  the  sample  median  and 
sample  mean  were  compared  in  simulation  experiments  in  SIMTBED  for  data 
generated  from  the  BELAR( 1 )  process.  The  simulation  output  is  compared 
to  the  theoretical   results  for   the  LAR( 1 )    process. 
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b.     Simulation  Results 

For  a  =  .1  and  a  corresponding  correlation  coefficient  of 
Y  =  .17664,  the  estimators  X  and  m  were  simulated  in  SIMTBED  using  a 
size  of  Type  III  from  Table  III.  E.  1.1.  The  results  are  given  in  the 
Summary  Statistics  in  Table  III. E. 2.1.  Looking  at  Table  III. E. 2.1  for 
N  =  1 00  and  greater,  there  is  no  evidence  of  non-Normality  from  the 
first  four  estimated  moments  of  the  sample  mean.  The  leading 
coefficient  in  the  asymptotic  expansions  for  E(X)  and  Var(X)  do  not 
deviate  significantly  from  the  theoretical  values,  i.e.  X  is  unbiased 
and  Var(X)   =   2.8581  /N. 

Looking  at  Table  III. E. 2. 2,  the  Summary  Statistics  at"  a  =  .1 
for  m,  it  appears  that  even  for  N  =  25,  m  is  unbiased  and  the  sample 
skewness  is  fluctuating  about  zero.  The  variance,  however,  at  each 
subsample  size  up  to  N  =  250  deviates  significantly  from  a  hypothetical 
asymptotic  variance  of  1.4291 /N,  the  corresponding  result  for  LAR(1). 
This  is  explained  by  the  kurtosis  of  the  estimate  m  of  the  median  which, 
although  decreasing  with  increased  subsample  size,  is  still 
significantly  different  from  0  until  N  =  250.  The  leading  coefficients 
in  the  expansions  for  the  expectation  and  for  the  variance  are  not 
significantly  different  from  0  and  1.4291  respectively.  Since  the  data 
are  only  slightly  correlated,  we  could  have  expected  the  sample  median 
to  behave  similarily  to  that  of  the  case  of  the  completely  random 
process  with  Laplace  marginals,  i.e.  m  is  unbiased,  asymptotically 
Normal,   and  has   a  variance  with  leading  coefficient    1/n. 
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For  values  of  a  =  .5  and  .844,  with  corresponding  Y  =»  .53662 
and  .89986,  using  Type  II  experiments  as  described  in  Table  III.  E.  1.1, 
we  again  compared  the   behavior   of  X   and  m. 

From  Tables  III. E. 2. 3  and  III. E. 2. 4,  we  see  that  the 
behavior  of  X  is  as  expected.  The  sample  mean  appears  to  be  unbiased. 
For  N  ^  250,  there  is  no  evidence  of  non-Normality.  The  estimates  of 
the  leading  coefficient  in  the  asymptotic  expansions  for  the  variance 
agree  within  one  standard  deviation  of  the  postulated  values  of  9.0  and 
38. 

The  corresponding  results  for  m  are  given  in  Tables 
III. E. 2. 5  and  III. E. 2.6.  The  sample  median  shows  no  bias  and  appears  to 
be  asymptotically  Normal  after  N  ^  250.  In  each  case  (a  =  .5  and 
a  =  .844)  the  leading  coefficient  in  the  expansion  for  the  variance  is 
smaller  than  the  corresponding  value  for  the  variance  of  the  sample 
median  in  the  LAR(l)   process,    i.e.    4.5  and   19  respectively. 

The  analysis  thus  far  has  indicated  that  at  least  for  data 
with  non-negative  correlation  in  the  BELAR(I)  process,  there  is  little 
evidence  to  suggest  that  the  behavior  of  the  sample  median  is 
significantly  different  than  in  the  LAR(1)  process.  From  Table 
III. E. 2. 7,  we  see  the  same  kind  of  results  that  Gastwirth  and  Rubin 
[Ref .  14]  reported.  As  sample  size  increases,  the  efficiency  of  X 
relative  to  m   drops   to  50^. 
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TABLE    III. E. 2.7 
Efficiency  of  X  Relative  to  rn  in  BELAR(  1 )    for   Y  >  0 


N 

T  =   +.1766 

25 

.64 

50 

.58 

75 

.55 

100 

.54 

125 

.52 

175 

.53 

250 

.51 

500 

.50 

T  =  +.63662 

.69 
.58 
.55 
.52 

.50 
.49 

.47 
.47 


T  =   +.9 

.98 
.81 
.73 
.67 
.62 
.57 
.53 
.48 


1.  For  T  =  +.1766  the  results  are  based  on  a  Type  III  experiment. 
For  the  other  two  cases,  the  results  are  based  on  Type  II 
experiments . 


We  also  simulated  X  and  m  for  negatively  correlated  data 
from  the  BELAR(I)  process.  Type  III  simulations  were  used  for  X  and  m 
at  Y  =  -.63662  and  a  Type  II  simulation  for  X  at  Y  =  -.9.  From  the 
Summary  Statistics  for  X  in  Tables  III. E. 2. 8  and  III. E. 2.9,  we  see  X  is 
unbiased  and  approximately  Normal  for  sample  sizes  greater  than  125. 
Estimates  for  the  coefficients  for  the  asymptotic  variance  are  not 
significantly  different  from  the  theoretical   values   of    .4441    and    .1053. 

From  Table  III. E. 2. 10,  the  most  obvious  point  to  be  made  is  that 
even  for  moderately  negatively  correlated  data,  m  is  not  Normally 
distributed  even  for  subsamples  of  size  500.  The  sample  median  is 
unbiased,  but  the  kurtosis  is  not  decreasing  fast  enough.  The  variance 
of  the  sample  median  even  at  N  =  500  is  almost  certainly  not 
( 1 /N)  ( 1 +Y/1 -Y)  .      However,   the   leading  coefficient   in  the  expansion  for 
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the  asymptotic  variance  is  within  a  standard  deviation  of  the 
hypothetical  values  (1 /N)  (1 +Y/1 -Y)  .  This  would  indicate,  for  the  case 
of  negative  correlation,  a  much  slower  convergence  of  the  sample  median 
to  Normality  than  for  positively  correlated  data. 

For  negatively  correlated  data  from  the  BELAR(  1  )  process,  we 
have  observed  that  X  does  not  lose  efficiency  relative  to  m  as  fast  as 
for  non-negatively  correlated  data.  In  fact,  from  Tables  III. E.  2.8  and 
III. E. 2. 10,  it  is  clear  that  the  variance  of  X  is  smaller  than  m  for 
subsample  size  N   <   100. 

3.      Estimators   of  Scale 

a.      Introduction 

In    the    case    of    estimating    the    scale    parameter,    X,    we 

considered   three   estimators.      Since   Var(X    )    =   2X^,    we    considered 

n 


X      =  S//2     where 


1      ^ 
S^    =  rj     I      (X.    -  X)^  (III. E. 3.1) 

i=1         ^ 


Since  the  maximum  likelihood  estimator  of  X  for  an  i.i.d.  sample  with 
marginal  Laplace  distribution  is  the  sample  mean  absolute  deviation 
about   the  median,  we  set 


N 
X      =  -     I      |X      -  m|  .  (III. E. 3. 2) 

1=1 
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As  a  third  alternative,  we  chose  the  scaled  median  absolute  deviation 
about  the  median, 


A       r  I X  .  -  m 

'3■T\^k3^^\■  (in.E.3.3) 


The    scaled    median    absolute   deviation   is   a   frequently   used   robust 
estimator   of   scale   [Ref.    38].      In  the  simulations,    we    assumed   that    X 
are   Laplace  with  median  =  mean  =  0  for   all   n.      Table  III. E. 3-1    contains 
a  summary  of   the  type  simulation   (as   defined    in   Table    III.E.1.1),    the 

^  A  A 

estimator(A    ,X    ,X    )   and  the   values   of   a  and  Y  that   were  used. 

TABLE    III. E. 3.1 
Summary  of  Simulation  Schedule  for  Estimators  of   X 
Y  -.89986  .17664  .63662 

a  .844  .1  .5 

Estimator 

A,  Type   II  Type   III  Type   I 

X  Type   II  Type   III  Type   I 

X-  Type   II  Type    III  Type    I 

b.      Simulation  Results 

In  the  Type  III  simulation  (See  Tables  III. E. 3-2  - 
III. E. 3. 4),  using  slightly  correlated  (Y  =  .17664)  realizations  of  the 
BELAR(I)  process,  we  found  the  best  estimator  of  X  to  be  X  ,  the  sample 
mean  absolute  deviation.  It  appears  to  be  unbiased  for  all  subsample 
sizes.      The    skewness   and  kurtosis  are  decreasing  with  increased  sample 
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sizes.  3ut  even  for  N  =  500,  the  skewness  is  still  significantly 
different  than  0.  Using  two-sided  t-tests  with  18  degrees  of  freedom 
for  the  equality  of  means  of  two  Normal  populations  with  unknown 
variances  at  the  90%  confidence  level,  we  reject  each  of  the  hypotheses 
independently  that:  (1)  Var(A  )  =  Var ( X  )  and  (2)  Var(X^)  =  Var(A  ). 
The  mean  relative  asymptotic  efficiency  of  X-  and  X  to  X  are  estimated 
from  the  regression  on  variance  coefficients  to  be  76}  for  X  and  60$ 
for    X    . 

Both  X  and  X-  appear  from  the  simulation  to  be  biased. 
From  the  second  coefficient  in  the  mean  of  regression  on  average  in 
Table  III. E. 3.2,  X.  app)ears  to  have  a  negative  bias  of  order(1/N).  From 
Table  ZII.E.3.^  it  appears  that  X_  has  a  positive  bias  of  order(1/N). 
However,  since  the  leading  term  in  the  expansion  of  the  mean  for  both 
estimators  is  the  true  value  of  T,  it  appears  that  both  X  and  X  are 
asymptotically  unbiased. 

When  we  considered  moderately  to  highly  correlated  data  (see 
Tables  111.1.3.5  -  111.1.3.10),  the  differences  in  the  behavior  of  the 
estimators  were  not  as  easy  to  discern.  The  particular  bias  of  X  and 
A^  was  even  more  apparent,  especially  at  the  smaller  subsample  sizes. 
As  \y\  increased,  so  did  the  expressions  for  the  asymptotic  variances. 
At  each  of  the  subsample  sizes,  ir.  Doth  types  of  correlation,  X  had  the 
highest  estimated  variance.  The  variance  of  X  was  significantly 
different  than  that  of  X  at  all  levels  of  significance  and  subsample 
sizes  up  to  N  =  500.  However,  we  could  not  reject  that  the  asymptotic 
variances  of  X  ,  \  and  X  were  the  same  at  each  of  the  two  levels  of 
correlation. 
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4.   Least  Squares  Estimation  of  Serial  Correlation 

In  this  section,  it  is  assumed,  unless  otherwise  stated,  that  X 

n 

has   a  standard  Laplace    (u  =  0,X   =   1)   distribution.      If  not,   standardize 

X     by 

n     ^ 


X     -  u 

X'    =  -^ ,  (III. E. 4.1) 

n 


where    \x  and  \  will   be  specified  from  those  estimators  already  discussed 
in  III.E.2  and  III.E.3. 

The  least  squares  estimator  of  the  lag-1  serial  corrlation,  Y  , 
is  derived.  First,  we  show  that  the  BELAR( 1 )  process  is  an  RCA(l) 
process  of  Nicholls  and  Quinn  [Ref .  16].  Then,  we  define  the  linearized 
residual  in  the  BELAR(I)  process  and  state  some  of  its  properties.  From 
these  properties  and  some  results  from  Nicholls  and  Quinn  for  RCA 
processes,   we  derive  the   asymptotic  properties   of    Y    _ .       The    properties 

Li  O 

of    T        are   observed   also  in  the  simulation  results   for  selected  values 

Lt  o 

of   T.      Finally,   the  joint   least  squares   estimator   of   location  and  serial 
correlation  are  derived  for   the   BELAR( 1 )    process. 

Rewriting    (III. D. 1.1)    by  adding  and  subtracting  YX  _    ,   we  have 

X^    =    YX^        +    {Ay^(a,1-a)    -   Y}X^    .    ^    e^  ,  (III. E. 4. 2) 


1  /  2 
where  Y  =  E{A     (a,1-a)}  as  given  by  (III. C.  2.  3)  for 

1  /2 
i    =  1  ;{A    (a,1-a)  -  Y}  is  an  i.i.d.  process  stochastically  independent 

of  the  i.i.d.  [e    }.      The  variance  of  the  random  coefficient  is  (a  -  Y^) 
n 
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for  all  n.  As  can  be  seen  from  (III. C. 2. 5)  and  the  fact  that  0  <  a  <  1, 
if  we  know  a,  then  we  also  know  |y|  and  vice-versa.  That  is,  in  the 
BELAR(I)  process,  there  is  only  one  independent  parameter  to  estimate 
for   the   correlation.      Now,   we  recognize    (III.E.4.2)    immediately   as    an 

RCA(1)    process    of    Nicholls    and   Quinn    [Ref.    16].       Since    {e    }    and 

1  /2 
{A        (a,1-a)    -   Y}    are   each    identically   distributed  as    well    as    being 

serially   independent   and  independent   of   each  other,  we  have   by  theorem 

2.7   [Ref.    16]   that    {X   }    is  the   unique  strictly   stationary   and   ergodic 

solution  to    (III. E. 4. 2). 

There  are  two  ways   to   look   at    the    linearized   residual    in   the 

BELAR(I)    process  just  as   described  in  Chapter   II    for  the  NLAR(1)   model: 


R      =    {A^^^(a,l-a)    -   T}X      ,    +    e    ,  (III. E. 4. 3! 

n  n  n-1  n 

or 


R     =   X     -   YX      ,.  (III. E. 4. 4) 

n         n  n-1 


From    (III. E. 4. 4),    we  see  that  since    {X   }    is  strictly  stationary,   so  is 

(R   }.      Also,   we  see  E(R    )    =  0  and  Var(R    )    =  2(1-Y^).      Lawrance   and  Lewis 
n  n  n 

[Ref.    22]    proved   that    the    R      are    uncorrelated ,    but    in    general,   not 

independent.      From    (III. E. 4. 3),   we  note   that    for    any   n,    R    *    e      unless 

a   =    0.       Except    for    when  a  =  0  or    1,    Var(R    )    >   Var(e    ).      As  a  increases 

n  n 

from   zero  to  one,   both  Var(R    )   and  Var(e    )   decrease   monotoni  cally   from 

n  n 

two   to    zero.      This    is    evident    from   the   definition  of   Y  in    (III. C. 2. 5) 
with   Jl  =   1  . 
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Two  other  properties  of  {R  }  are  obtained  from  (III. E. 4. 3)  by 

conditioning  on  the  independent,  identically  distributed  processes  ie    } 

1  /2 
and  {A   (a,1-a)  -  Y}  up  to  time  k  =  n  -  1.  We  have 


E[R^|{ej^,Aj^^^(a,1-a)  -  T}  ;   k  =  1,2,...,n-1] 

=  X  ^Eik'^^^ia   1-a)  -  Y}  +  E(e  )  =  0,       (III. E. 4.5) 
n-1   n  n 


1  /2 
because    {A        (a, 1-a)    -  Y}   and  e     are  independent   of   the  process   through 
n  n 

time  n-1   and  X  _     is  a  function  only  of   the  process   through  n-1. 


E[R^|{e^,    A^J^^(a,1-a)    -   Yl  ;    k   =   1,2,...,n-1] 

=   E(e^)    +  x^    .E[{Ay^(a,1-a)    -   Y}^] 
n  n-1  n 


=   2(1-a)    +x^_^(a-Y'),  (III. E. 4.6) 


which    is   only   a    function   of    a   or    Y^    alone,   since   a  determines   Y^   and 
vice-versa . 

Now   using    (III. E. 4. 4)    and  a  given  realization  of    {X    }   of   size  n, 

n 
we  minimize      T   R?    with  respect    to    Y   to   obtain   the    conditional    least 
i=2    ' 

squares   estimate  for   Y.      This   is  the  same  procedure  as   described  for   the 
NLAR(1)    process.     We  have 


^  n  n 

Y        =   (    I     X  X        ]   /    (    I     x^    ]    .  (III. E. 4. 7) 

LS  .^2      11^  i=2      '    ^ 
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Two  problems  can  occur   using   (III. E. 4. 7),    especially   for    small 

sample    sizes.       For    the    BELAR(I)    process    defined    by    (III. E. 4. 2), 

1    >    T   ^    0,    and   yet    it    is    possible   that    Y        <    0    or     l^'r^l     >    1-       If 

-1    <    Y        <    0,    we  would   estimate   that    the    sample    {X    }    came   from   the 
Ljo  n 

1/2  r  "  I 

BELAR(I)    process  with  the  negative  sign  on  A        (a,1-a).      If    |y|     >    1  ,    we 

would  estimate  Y  by   +1    or  -1. 

In  order  to  obtain  the   "least  squares"   estimate  for   a,    we   solve 

numerically  for   a        in 


r(ct_+i/2) 
Y.J  = -^-   -rr^S — '  (iii.E.ij.8) 

/    TT  LS 


for   a  given   Y       from    (5.7)    if    |y      |    <    1. 

The    estimator    Y    _    given    by    (III. E. 4. 7)    has    the    following 

Lj  O 

properties   which  we  state  as   a  corollary  to  Theorm   3.1    [ Ref .    16]: 


CORROLLARY.  For  {X  }  given  by  (III.E.M.2);  {R  }  in  (III. E. 4. 3) 
and  (III. E. 4. 4),  the  least  squares  estimator  y  ^  has  the  following 
properties : 


a)     Y^3-2^5..  ,, 


b)      Since   E(X'*)    =  24    <    »,    ( ^-i  )  ^  ^^  ( Y      -y)    has    a    distribution 
n  LS 

which    converges    to    the    Normal    with   a   mean   of    zero   and  a   variance    a5, 
given   by 
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a^  =   1+5a-6Y^ 


(III. E. 4. 9) 


The  proof  follows  from  Theorem  ( 3.  1 )  .      The   strict    stationarity 

and  ergodic   nature   of    {X    }    leads   to  the  almost  sure  convergence.      The 

n 

results  of  (III. E. 4.5)  and  (III.E.M.6),  together  with  Bill ingsley ' s 
Martingale  Central  Limit  Theorem  provide  the  results  for  the  asymptotic 
Normality  of   Y      . 

A  strongly  consistent  estimator  for  the  variance,  a?.,  is  also 
given  in  [Ref  .  16]  for  the  general  RCA(1)  process.  For  o^  in 
(III. E. 4. 9),   this  estimate  becomes 


0^    = 
Y 


(n-1) 
n 

i:2   '-' 


''-\s^ 


n 

I 
i  =  1 


X^    + 


n 

(a     -Y^    )      ^      i-1 
^    LS      LS^    i=2 

n 
i=2    '-' 


.    (III.E.4.10) 


For   large  n    (III.E.4.10)    is  approximated  by 


Y 


^'-\s'   ' 


^--'^^\s-ks^  1=2'^-^ 


1=2    '-' 


(III. E. 4.1 1) 


where  Y       is  from    (III. E. 4.7)   and  a  „    (III. E. 4. 8). 

Simulations  of  the  least  squares  estimator  of  Y  were  conducted 
for  selected  values  of  Y  in  SIMTBED  using  Type  III  plans.  The  results 
are   summarized    in   Tables    III. E. 4.1,    III. E. 4. 2   and    III. E. 4. 3.       The 
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results  reflect   the  theoretical    behavior    of    the   estimator    as    derived 
above. 

We  note  that  the  joint   conditional  least  squares   estimators  of   y 

and    Y    in   the    BELAR(I)    process    are   the   same   as    in   the   linear    AR(1) 

n 
processes.     Minimizing  the  sum  1  R?  where  now 

i=2  ^ 


R.    =    (X.-u)    -   Y(X.^^-u),  (III. E. 4. 12) 


leads   to  the  following  joint   estimators  for   \x  and  Y 


n  /s     n  ^ 

y   =    [    I  X     -   Y      I  X        )      /    (n-1)(1-Y),  (III. E. 4. 13) 

i=2  i=2 


^  n  n 

Y   =         I    (X   -y)(X        -u)       /      I    (X        -y)^         (III.E.4.U) 
i=2  i=2 


For  large  n  these  equations  reduce  to  the  familiar  ones 


U  =  X  (III. E. 4. 15) 


^    n     _      _     n      _ 

Y  =   y  (X.-X)(X.  ,-X)  /   y  (X.  ,-'X)2.     (III. E. 4. 16) 
i=2   ^     ^"       i=2   ^" 


We   now    turn   in   the    next    section    to    the    question    of    alternative 
estimators  for   Y  given  that   y  =  0  and   A   =   1  . 
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5.      Other  Estimators   of   the  Lag-1    Serial  Correlation 
a.      Estimators  Based  on  a  Non-linear  Residual 

In  this  section,  we  explore  other  possibilities  for 
estimating  Y  in  the  BELAR( 1 )  process.  There  is  a  question  as  to  why  one 
should  use  the  linear  residual  since  the  BELAR(  1 )  process  is  a  random 
coefficient  process  which  is  non-linear.  Secondly,  why  should  you 
minimize  the  square  of  the  linear  residual  as  opposed  to  minimizing  some 
other  symmetric  loss  function  which  is  a  function  of  the  linear 
residual?  The  answer  to  both  questions  is  that  the  least  squares 
estimator  of  Y  based  on  the  linear  residual  out- per  formed  other 
estimators   in  the  simulation  experiment. 

Consider   the  following  types   of   non-linear  residuals 


»  2 

R      =   X   -YX        -6(X   -2) ,  (III. E. 5.1) 

n         n       n-1  n 

R'    =   X^-YX^    1-BX'      iSign(X^    .) ,  (III. E. 5. 2) 

n  n       n-1  n-i  n-i 


From    (III. E. 5.1),    it  follows   that   R     has    zero  mean   and 

n 


Var(R    )    =   2(1-Y^+106^)  ,  (III. E. 5. 3) 

Cov(R    ,R      J    =   20a6^.  (III. E. 5. 4) 

n     n-1 


Introducing  the  extra  parameter,  Bi  makes  the  residuals,  R  ,  correlated 
unless  a  =  0  or  6  =  0 .  If  6  is  zero,  then  we  again  have  the  usual 
linearized  residual  in  (III. E. 4. 4).  If  6^  =  Y^/10,  then  the  variance  is 
a    constant,    but    the    residuals    are   still    correlated.       It    is    easy  to 
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compute  the  least  squares  estimators  for  T  and  6  from  (III. E. 5.1)  and 
(III. E. 5. 2).  We  simulated  the  estimators  of  Y  and  6  and  compared  them 
to  the  results  based  on  (III. E. 4. 4)  with  6  =  0.  From  Table  III. E. 5.1, 
we  see  that  the  different  estimators  of  T  from  all  three  residuals  are 
close  to  the  true  Y.  The  result  is  that  the  estimates  of  6  are  very 
close  to  zero. 

To   see   how  much  the   value  of   Y  could  change  with   6  fixed  at 
some  non-zero  values,   we  simulated  the  least  squares   estimator   of   Y  with 

6    =   0   and   the    estimator    of    Y   based   on    (III. E. 5.1)   with   6   =  Y/  1 0  and 


again   with    6    =      -Y//    10.       From   Table    III. E. 5. 2,    we   see   that     6    '^    0 

severely  alters    the    estimate  of   the  serial   correlation.      Therefore,    in 

the  remainder   of   this   subsection,   we  consider   alternative  estimators  for 

Y  in  the   BELAR( 1 )    process   to  be  only  those   based  on  the   linear  residual. 

b.      Estimators   Based  on  the  Linear   Residual,    R 

n 

Besides   the   asymptotically  unbiased  least  squares   estimator, 

we  considered  the  following  well-known  estimators   of    Y    in    linear    AR(1) 

models: 

1)      The  Huber(c)   function  as   described  by  Denby  and  Martin   [Ref.    38]. 

The    estimator,    Y„ ,    is    the    value    of    Y   that    satisfies   the 

H 

equation 


n 

I  X.    ,4'„(x.-Yx.    ,)    =  0,  (III. E. 5. 5) 

^    1  —  I     n       1  1"  I 

1=2 
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TABLE    III. 3.5.1 

Simulation  Results  for  Various   Definitions  of  R      in  BELAR(  1 ) 

n 


1. 

N  =   500 

a  =    .5 

DATA 

B   =  0 

Y 

XI 

.56891 

0 

.57192 

X2 

.61996 

0 

.61630 

X3 

.62651 

0 

.62604 

X4 

.57995 

0 

.58374 

X5 

.59236 

0 

.59233 

AVG 

.59754 

.59807 

STD 

.02499 

.02257 

BIAS 

-.03908 

-.03855 

Y  =   Corr(X    ,X      ,  ) 
n     n-1 


,63662 


00279 

.62082 

-.01771 

00815 

.56054 

+  .01637 

00358 

.78189 

-.05808 

01865 

.75716 

-.07208 

02100 

.70995 

-.04748 

00829 

.68607 

-.03580 

01154 

.09330 

.03535 

00829 

+.04945 

-.03580 

2. 

N  =    1000 

a  =    .5 

Y  =   Corr(X    ,X      ,  ) 
n     n-1 

=    .63662 

DATA 

\s 

6   =  0 

Y 

B                          Y' 

6' 

Y1 

.63026 

0 

.62955 

-.00423              .62985 

.00013 

Y2 

.67422 

0 

.65653 

.02520               .59178 

.03095 

Y3 

.62566 

0 

.62921 

-.00590               .59646 

.01093 

Y4 

.67738 

0 

.67777 

.00233              .60522 

.02359 

Y5 

.64664 

0 

.64784 

-.00560              .62841 

.00581 

AVG 

.65083 

.64818 

.00236              .61034 

.01428 

STD 

.02411 

.02032 

.01320              .01782 

.01273 

BIAS 

+  .01421 

+  .01156 

+.00236            -.02628 

+  .01428 

3. 

N   =    1500 

a   =    .75 

DATA 

6    =   0 

Y 

Z1 

.81183 

0 

.81671 

Z2 

.80699 

0 

.80700 

Z3 

.81777 

0 

.81795 

Z4 

.85279 

0 

.85569 

AVG 

.82235 

.82434 

STD 

.02077 

.021 47 

BIAS 

-.01229 

-.01029 

Y=Corr(X^,X^.,) 


00797 

.86364 

00040 

.82072 

00160 

.83399 

00728 

.89116 

00033 

.85238 

00629 

.03147 

00033 

+  .01775 

,83463 


8' 

,01821 
,00511 
,00641 

,00193 

,01041 

00598 

,01041 
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TABLE    III. E. 5. 2 

Simulation  Results  for  Various  Definitions 

of  R     to  Estimate  Y  Given   g  in  BELAR(I) 
n 


N  =  500; 

DATA 

(y^s  ^  =  °' 

1 
2 
3 
4 

5 

.56891 
.61996 
.62695 
.57995 
.59236 

a   =    .5 


y  =    .63662 


[y' 


e  = 


T 


/To 


27552 
21515 
38621 
3^356 
36082 


-T 


/To 


.27^10 
.26257 
.38450 
.39730 
.40557 


where 


T„(t)    . 


t  if    |t 1    <   c, 

c  Sign(t)      if    It  I    >  c. 


(III. E. 5. 6) 


The  corresponding  weight  function  w  (t)  is  H'„(t)/t  and  c  is 

ri  H 

a  tuning  constant.      As   c  goes   to  infinity  4'„(t)   approaches    t    and    Y„    is 

H  n 

the    least    squares    estimator    of    T.       If    c    =  0,    we  have  the  solution  of 
(III. E. 5. 5)    is  the  median  of   X./X._    . 

For  c  other  than  0  or  ">,  there  is  no  closed-form  solution  to 
(III. E. 5. 5).  We  obtain  the  Huber(c)  estimator  of  Y  by  iterating  the 
following  scheme: 


k+1 


"  (VVi-1 

1=2 ^  r 

n  j-x.  -Y  X.       ^ 

..^1-,  "h[  '    s/'  i 


(III. E. 5. 7) 
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where  T     is   the  least  squares   estimator   of   T  and 


median    |X . | 
^  -       .69315  '      •  (III.E.5.8) 


is    the    scaling  constant    for    the    R..       If    Y  =  0,    then  S     is  the  median 

1  r 

absolute  deviation  estimator  of  the  scale  parameter  in  the  Laplace 

distribution  as  given  in  Section  III. £.3-   Typical  values  of  c  are  1, 

1.5,  2.  We  use  for  illustration  c  =  1  in  the  simulation  along  with  Y  , 

the  least  squares  estimate,  and  Y.,,  the  median  (X./X.  ,). 

M  1      1-1 

2)      The  Least  Absolute  Deviation    (LAD)   estimator   of   Y  is   the  minimizer 
of 


n 

I    Ix.-Yx.    ,  I  .  (III. E. 5. 9) 

i=2      '        '-'^ 


The   solution   is,    Y,,,,,    the  weighted  median  of   x./x.    ,    where 

WM  11-1 

the  weights   are  x._     for   i   =   2,...,n. 

Denby  and  Martin  [ Ref  .  38]  reported  that  the  Huber(c) 
estimates  are  consistent  and  asymptotically  unbiased  for  linear  AR(1) 
models.  Bloomfield  and  Steiger  [Ref.  39]  showed  that  the  LAD  estimator 
is  strongly  consistent  and  asymptotically  unbiased  for  linear  AR( 1 ) 
models.  In  Figures  III. E. 5.1  -  III. E. 5. 4  are  examples  from  SIMTBED  of 
the  behavior  of  these  estimators  in  simulated  data  from  LAR( 1 ) ,  a  linear 
AR(  1 )  model  with  Laplacian  marginals  and  AR( 1 )  correlation  structure 
given  in  Chapter  II.  These  results  appear  to  be  consistent  with  the 
results  reported  above  for  linear  AR( 1 )  processes.  The  leading 
coefficient   in  the  expansion   for    the   mean   of    each    estimator    does    not 
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differ  significantly  from  the  true  value,  0.63662.  We  also  see  that  the 
median  (X./X._.)  and  the  weighted  median  (X./X._  )  estimators  are 
considerably  more  efficient  than  either  the  Huber(c)  estimator  in  Figure 
III. E. 5. 3  or  the  least  squares   estimator    (c   =  «)    in  Figure  III.E.5.^. 

Since  the  least  squares  estimator  remains  asymptotically 
unbiased  for  the  BELAR(I)  process  as  was  shown  in  Section  III.E.4,  it 
was  of  interest  to  observe  how  the  Huber(c)  estimators,  c  <  »,  and  the 
LAD  estimator  of  T  would  behave.  Considering  the  ordering  suggested  by 
the  simulation  results  in  the  LAR(1)  process,  it  would  seem  possible 
that  the  Huber(c)  estimates  could  be  better  than  the  least  squares 
estimator  of  Y.  In  the  boxplot  analyses  in  Figures  III. E. 5. 5  - 
III. E. 5.8  are  the  results  of  the  simulation  for  T  =  .63662,  but  for 
data  from  the  BELAR(I)  process.  The  boxplots  in  Figure  III. E. 5. 5 
display  the  theoretical  behavior  of  the  least  squares  estimator  of  T. 
The  other  estimators  of  Y  appear  to  be  converging  to  other  values 
Y_  *  Y.  To  see  this,  note  the  first  entry  in  the  coefficients  for  the 
asymptotic  expansion  of  the  mean  of  Y  in  Figures  III. E. 5.6  -  III. E. 5.8. 
In  each  case  Y  >  Y.  Also  from  the  estimate  of  the  standard  deviation, 
we  assert  that  Y  is  significantly  larger  than  Y  for  each  of  the 
alternative  estimators  investigated  here,  because  the  difference, 
|y   -  Y„|  ,    is  larger   than  four  standard  deviations. 

For  the  BELAR(  1 )  process,  we  observe  a  reversal  from  the 
LAR(  1  )  process  in  preference  for  the  estimator  of  Y.  We  will  use  the 
least  squares  estimator  as  the  initial  estimator  of  Y  in  the  iterative 
procedure  for  finding  the  maximum  likelihood  estimator  of  Y  which  we 
develop  next . 
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6.      Maximum  Likelihood  Estimation  of   Y 

a.      Introduction 

In  this  section,   we  develop  the  maximum  likelihood  estimator 

of   the  lag-1    serial   correlation  in  the   BELAR( 1 )    process,    T         .      We    use 

MLhi 

the   expression   for    the    logarithm  of   the   likelihood  function,    L(a)  ,    in 

(III. D. 2. 12)    in  an  iterative  procedure  to  find  the    values    of    a   and   the 

1  /2 
sign    of    A         (a,1-a),    that  minimizes   -L(a);    call   the   pair    (cL,.  p,    sign). 

1  /2 
Since   knowing  a  and  the  sign  of  A        (a,1-a)   uniquely  defines   T,    Y^..  „  can 

n  nLh# 

be  found  from    (III. E. 4.8)   using    (a.„  ^,    sign). 

MLE 

We  consider  only  the  univariate  problem.  That  is,  we  have 
assumed  that  {X  }  is  marginally  Laplace  distributed  or  have  determined 
from  Q-Q  plots  that  the  best  S,-Laplace  fit  to  the  data  is  when  J,  =  1  . 
Secondly,  we  assumed  that  {X  }  is  standard  Laplace  (y  =  0;  X  =  1)  or 
that  {X  }  has  been  standardized  using  a  pair  of  estimators  (u,  X)  from 
Sections   III.E.2.    and  III.E.3. 

As  a  function  of  a,  (III. D. 2. 12)  is  very  complicated.  There 
is  little  hope  of  being  able  to  analytically  solve  for  the  critical 
values  of  a.  In  fact,  the  evaluation  of  a  derivative  of  (III. D. 2. 12)  is 
at  least  as  expensive  computationally  as  the  function  values  themselves, 
since  (III. D. 2. 12)  contains  exponential  functions  of  a.  However,  since 
this  is  a  one-dimensional  optimization  problem,  there  are  IMSL  routines 
that  will  perform  the  search  without  using  de  vi  at  i  ves--Golden  Section 
search;    bisection  method;    or   interpolation  routines. 

We  chose  the  IMSL  routine  ZXLSF  which  performs  a  one- 
dimensional  search  for  a  minimum  of  a  smooth  function  in  a  closed 
interval    using    quadratic    interpolation.       The    FORTRAN    routine   which 
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evaluates  ( III. D.  2.  12)  is  formulated  so  that  ZXLSF  is  searching  on  the 
interval  (-1,1)  where  a  <  0  implies  that  conditional  densities  of  the 
form  (III. D. 2. 10)  are  being  evaluated  instead  of  those  given  by 
(III. D. 2. 9)  when  a  >  0.  The  initial  value  for  a  to  start  the  iteration 
procedure. of    ZXLSF    is    a    four-digit    approximation    (a    _,    sign      ) 

Li  O  LiO 

corresponding  to  the  least  squares  estimate  of  serial  correlation,  Y  , 
Obtained  from    (III. E. 4. 8). 

The  queston  of  accuracy  in  the  calculation  of  (III.D.2.12) 
is  especially  important  because  the  likelihood  surface  is  extremely  flat 
in  many  cases.  We  want  some  assurance  that  ZXLSF  is  efficiently 
searching  for  the  optimum  and  not  "chasing  roundoff  errors".  This 
happened  before  we  increased  the  accuracy  parameter  in  DCADRE  and  used 
double  precision.  In  order  to  assess  the  accuracy  of  our  calculations, 
we  constructed  first-  and  second-divided  differences  for  values  of  a  and 
(III.D.2.12).  The  divided  differences  are  approximations  for  the 
derivatives.  For  those  simulations  that  we  checked,  there  was  one 
transition  of  the  slope  through  zero  at  the  critical  point  found  by 
ZXSLF.  The  second-divided  differences  at  all  points  in  the  vicinity  of 
the  critical  value  were  positive  indicating  the  general  convex  upward 
shape  of  (III.D.2.12).  Sometimes  there  was  some  fluctuation  in  values 
of  the  second-divided  differences,  but  no  change  of  signs  near  the 
reported  optimum. 

The  fluctuating  values  of  the  second-divided  difference 
indicated  some  noise  in  the  calculations.  This  occurred  in  two  places. 
If  the  s  eco  nd- di vi ded  difference  covered  points  on  both  sides  of 
a  =   1/2,    then  there  was   often  a  jump   in  the   value  of   the  second-divided 
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difference.  This  occurred  because  of  the  change  in  the  method  of 
calculating  the  conditional  density  when  a  changed  from  a  <  .5  to 
a  ^  .5.  Other  times,  slight  aberrations  in  the  observed  pattern  of  the 
second-divided  differences  occurred  for  values  of  a  that  were  small, 
0  <  a  <  .15.  This  is  attributed  to  the  fact  that  DCADRE  evaluations  for 
the  table  of  values  of  the  ( 1 -a) -Laplace  density  (0  <  a  <  .15)  in  many 
subintervals  was  not  behaving  regularly.  The  computed  value  was 
accepted  because  the  estimated  error  was  small,  relative  to  the  accuracy 
requirements.  The  important  consideration,  however,  was  that  no  error 
in  calculating  (III. D. 2. 12)  should  be  so  large  as  to  falsely  indicate  a 
change  in  convexity  in  the  vicinity  of  an  extremum,  so  that  ZXLSF  would 
be   ineffective  at  locating  it. 

The  selection  of  a  good  starting  point  in  this  procedure  is 
also  important.  It  is  desirable  to  commence  the  iteration  in  ZXLSF  as 
close  to  the  global  optimum  as  possible  in  order  to  reduce  the 
possibility  of  converging  to  a  local  optimium.  Note,  also,  that  as  a 
function  of  a,  the  conditional  density  is  not  necessarily  convex  and 
often   is  not   even   unimodal   across   the  range  from   Y  =   +1    to  Y  =   -1  . 

Since  (III.D.2.12)  is  the  logarithm  of  the  product  of  such 
functions,  there  is  no  assurance  that  (III.D.2.12)  has  a  single  relative 
maximum  especially  for  small  sample  sizes.  When  the  sample  size  is 
small,  it  is  advisable  to  pick  a  starting  value  for  the  iteration  on 
both  sides  of  a  =  0.  Select  the  maximum  likelihood  estimator  to  be  the 
one  with  the  higher  value  of  L(a)  if  the  routine  produces  two  different 
a's,   corresponding  to  the   pairs    (a. ,+)    and   (a„,-). 
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Since  we  know  that  Y.  _  i s  a  consistent,  asymptotically- 
unbiased  and  asymptotically  Normally  distributed  estimator  for  Y,  we 
chose  the  value  of  a  and  model  corresponding  to  Y  as  our  initial  guess 
in  ZXLSF. 

b.      Simulation  Results 

The  maximum  likelihood  routine  for  estimating  Y  was  tested 
in  simulations  using  computer  generated  data  from  the  BELAR(  1  )  process 
with  known  parameter  values  of  i,  u.  ^  and  a.  By  performing  M 
independent  simulations  of  sample  size  N  (where  N  is  increased  for  each 
set    of   M    simulations)  and  fixed  a,   we  were  able  to  compare  the  standard 

deviation   and    bias    (if    any),    of    Y,,,  „    to   that    of    the    initial    least 

MLE 

squares  estimator  Y  ,  for  which  the  asymptotic  distribution  is  Normal. 
Changes  in  the  Normal  plots  for  one  set  of  M  simulations  for  N  small  to 
a   second   set   of  M   simulations  for   a  larger  N   would  give  some  indication 

of   how  fast  Y,,- „   is  or   is  not   converging  to  a  Normal   distribution. 

MLE 

Both  M  and  N  were  small  in  the  simulations  for  two  reasons. 
Since   the    asymptotic   distribution   of    Y        was    known,    it    was    of    more 

Li  O 

interest    to  see  how  much  better   Y,,,  „  was   for   the  smaller  samples    (i.e., 

MLE 

was  the  bias  smaller  for  Y  or  was  it,  in  fact,  unbiased).  Secondly, 
the  run  times  for  calculating  (III. D. 2. 12)  for  N  <  200  was  long.  The 
evaluation  per  sample  of  size  N  =  25  ranged  from  100-300  sees.  For 
N  =   175,    the  run  times   ranged  from  700-950  sees. 

Figures  III. E. 6.1,  III. E. 6. 2  and  III. E. 6. 3  are  the  Normal 
plots  of  twenty  realizations  of  the  maximum  likelihood  estimator  of 
serial  correlation  and  the  least  squares  estimator  of  serial  correlation 
for    simulated    data   from    the    BELAR(  1 )   process   for  selected  values   of   a 
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and  for    two  subsample   sizes,    SSN.      The    layout    provides   for    two-way 

comparisons.     That   is,   Y.„  „  from  smaller  SSN  can  be   compared  to  Y,„  ^  for 
'^  MLE  MLE 

larger  SSN.        Likewise,   for   a  given  SSN,    Y„,  „    can    be    compared   to    y,o» 

which   is    known  to  have  an  asymptotic  Normal   distribution.      The  straight 

line  in  the  Normal    plots    corresponds    to   a  Normal    distribution.      The 

curved   lines    correspond   to   the   Kolmogorov-Smirnoff   bounds   calculated 

from  the  data  at   the    95%   confidence    level    by   the   routine   in   the    IBM 

experimental  APL  routine  called  GRAFSTAT. 

It  appears  from  these  figures   that  for  the  larger    values    of 

SSN,    Y..,  c  ^^^  ^T  c  fit  Normal   distributions   better  than  the   corresponding 
MLh  Lo 

samples  from   the  smaller   values   of  SSN.      It  also  appears   that    T,„  „    fits 
■^  MLE 

a   Normal    distribution  as  well   as   the   y       for  the  larger   values   of  SSN. 

This    supports    the    notion   that    the   maximum    likelihood   estimator    is 

converging  to  a  Normal   distribution. 

Figures    III.E.6.M,     III. E. 6. 5    and    III. E. 6. 6    are    the 

corresponding  scatter   plot   analyses   for   the   data  in  the   previous  figures 

for    the    larger    value    of    SSN.       It    appears    that    Y,„  „    and    Y,  „    have    a 

MLE  LS 

positive    correlation    coefficient   which  becomes  more  pronounced  as    the 

data  becomes   less   correlated.      The   distribution  of   Y„,  „  also   appears    to 

MLE 

have    a  smaller    variance    than    Y,,,.      This  effect   is  more  pronounced  for 

more   highly   correlated    data.      Compare,    for    example,    the    estimated 

standard    deviation   of    Y,„  ^    and   that    of    Y, ^    from    the  table  in  Figure 

MLE  LS 

III. E. 6. 4   with   the    corresponding   entries    in   the    table   from    Figure 
III. E. 6. 6. 
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F.      8.-LAPLACE   MOVING  AVERAGE   MODELS 

1 .  Introduction 

In  this  section,  we  derive  a  time  series  model  that  has  an 
il-Laplace  marginal  distribution  and  the  correlation  structure  of  a 
linear  q  -order  moving  average  model.  This  construction  uses  the 
square  root  Beta-Laplace  transform  given  in  Section  III.B.3.  The  first- 
order  model  retains   the  full   range  of   correlations   up  to   1/2. 

2.  The  First-Order  Moving  Average  Model 

Let    {L   ii-a)}    be    an    i.i.d.    sequence   of    ( Jl-a) -Laplace    random 
n 

1  /2 
variables;     {A         (a,il-2a)}    be    an    i.i.d.    sequence,    independent    of 

{L   (£-a)},   where  A      is  a  Beta   (a,il-2a)    random   variable  and  0   <    a    <    il/2. 

Both   the    innovation   and   the    coefficient    sequences   are  independent   of 

X     ,,  X     ^,    ...    .      Then  the  sequence    {X   {i}   given  by 
n-1        n-2 '  ^  n  ^  ^ 


X    ii)    =   L    H-gl)    +   A^^^(a,il-2a)L      Ai-a),  (III. F. 2.1) 

n  n  n  n-1 


has   a  marginal    H-Laplace   distribution  and  an  MA(1)    correlation  structure 

such  that   0  <   Corr(X    ,X      ,)   <    1/2. 

n'   n-1 

To  see  that  X    (i)   has   an   2,-Laplace   distribution,   first  note  that 

by    the    square   root    Beta-Laplace  transform  theorem  of  Section  III.B.3, 

1  /2 
the   distribution  of    the    product    A         (a,Jl-2a)L   _ Ai-a)    is    a-Laplace. 

Then  note  that  X    (il)    is  the  sum  of   two  independent   random  variables,   one 
n 

of   which   has    an    ( £-a) -Laplace    distribution   and   the    other    has    an    a- 
Laplace    distribution.      So,    if    (j)    (w)    is  the   characteristic  function  of 

A 

X    (£) ,   then 
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♦x("'  -  liT^n  [r^r  ■  (ri^r  •  (iii.F.2.2) 


To  see  that    {X   (i)}   has   the   correct   correlation  structure,   first 
n 

note    that    by    the    construction   of    (III. F. 2.1),    X      ,     is    explicitly 
J  '      n-k  ^  ^ 

independent   of   X     for    Ik  I    >    2.       Therefore,    Corr(X    ,X      ,  )    is    zero   for 
n  '    '  n      n-k 

IM  s  2. 

For   k  =  ±1 ,   we  have,   after  some  simplification 


ar(a+-)r(Jl+1-a) 

Corr(X    ,X        )    =  = —  .  (III. F. 2. 3) 

"^     ^'^  S,r(a+1)r(Jl-a^-^) 


Finally,   note  that   in  the   limit  as   a  ^  0,    (III.F.2.3)    approaches 

zero.      Also,   as   a  ->■   2,/2,    (III.F.2.3)   approaches    1/2. 

Replace  A^  ^^(a,  2,-2a)    in    (4.1)   by  -A^  ^^(a,  8,-2a)  ,    we    have    a   full 
n  n 

range      (-1/2,0)    of   nonpositive  lag-1    serial    correlations. 
3.      The   q-Order  Moving  Average  Model 

The   MA(q)    model    for    q   ^    2   is    the  extension  of   the  MA(1)   model 

given  in   (III. F. 2.1).      Let    {L   ( i-qa) }   be   an   i.i.d.    sequence    of    (i-qa)- 

1  /2 
Laplace   random    variables.      Let    [A      .    {a,  il-(q  +  1  )a}  ]   for   i    =   1,...,q  be 

n ,  1 

i.i.d.   sequences,    independent    of    each   other    and   of    {L    (  2,-qa)  }  ,    where 

A       .     is    a    Beta    {  a ,  il- ( q  + 1  )  a }    random    variable    for    all    n    and   all 
n  ,  1 

i   =   1,...,q.      Also,    0  <   a   <   il/(q  +  1  )  .      Both   the    innovation   and   each   of 

the    coefficient    sequences    are    independent   of   X      ,  ,X     ^ Then  the 

n-1      n-2 

sequence    {X    (i)}   given   by 
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1/2 


X    (J,)    =   L   (Jl-qa)    +      I      A'^{a,J,-(q+1)a}L        (X,-qa)  ,  (III. F. 3.1) 

n  n  .^.riji  ri~  i 


has   a  marginal   2--Laplace  distribution  and  an  MA(q)   correlation  structure 

for    0    <    a    <    8,/(q  +  1).      When    a   =  0,    then    {X    (8,)}    is  an  i.i.d.   sequence; 

when  a  =   2,/(q+1),   then  the  moving  average   is  an  equal    weighted   average 

of   q+1    i.i.d.   a-Laplace  error  terms  L   (a). 

n 

To  see  that  X    (J,)    is  an   S,-Laplace   random    variable,    first    note 

from    the    square   root    Beta-Laplace    transformation   theorem  of  Section 

1  /2 
III.B.3,    that   each  product   A      .{a,    2,-(q  +  1)a}L  _.(2,-qa)   has   an    a-Laplace 

distribution. 

But   the  sum  of   q  i.i.d.    a-Laplace    random    variables    has    a    qa- 

Laplace    distribution.      Thus,   X    (8,)    is   the  sum  of   two  independent   random 

n 

variables   and  its   characteristic  function  is  obtained  as   the   product 


X  [l+w^  J  .^.  (.1+0) 


J U-qa   f J jqa  _    (j U 

1+a)M  l+oiM  1+u)"      • 


(III. F. 3. 2) 


The    correlations    are   truncated   at    lags    |  k  |    ^    q  +  1.       By    the 

construction  of    (III. F. 3.1).   X      is  explicitly    independent    of    X      ,     for 

n  n-k 

Ik  I    i   q+1 . 

Negative    correlations    are   obtainable    with    2      choices    for 

1/2  1/2 

replacing   or   not   replacing   [A      .  {a,  il-(q  +  1  )a}  ]   by   [-A      .  {a,  ll=(q  +  1  )a}  ]    in 

(III. F. 3.1). 
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This   model    can    be    generalized   from    the    one- parameter   case  by 
replacing   qa    in    (III. F. 3.1)    with   a.     in    each    term    in    the    sum,    and 

q 

replacing  L   ( Jl-qa)    by  L   ( 2,-q     E  a.  )  . 

1  =  1 
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IV.      RESIDUAL  ANALYSIS    COMPARISON   OF   THE   NLAR(1 ) 
AND  THE   BELAR(I)    PROCESSES 

A.      INTRODUCTION 

Lawrance    and   Lewis    [Ref.    22]    developed   a    higher-order    residual 

analysis  for   non-linear    time   series    with  autoregressi ve    correlation 

structures.      Specifically,   they  developed  a  third-order   analysis  based 

on  the   cross- correlation  of   the  linear  residual,    R    ,    and    its    square    at 

n  ^ 

lag  k,  R^_.  .  They  applied  the  analysis  to  the  problem  of  modelling  wind 

11      K 

speed  data.      It  is    important    to    note   that    this    analysis   was    done    in 

conjunction  with,   and  not   in  place  of,   the  usual   second-order   analysis. 

As  has   been  already  pointed  out,   second-order   analysis  is  sufficient  for 

modelling  only  when  the   processes   are  both  linear   and  Normal. 

The  residual   analysis  involves   only   joint   moments    of    order    three. 

In   Chapter    II    of    this   thesis,    it  was   shown  that  for  the  NLAR(p)   models 

with  p  =  1,2,   all   the  third-order  moments--that    is,    those  of    the    form 

E(X.X.X    )    for   all    i,   j,   k — are  zero.      Therefore,   the  Lawrance   and  Lewis 
1    J    k 

residual   analysis  will   not   differentiate  between  the    NLAR(p)    processes 

with    the    same    autocorrelation   structure.       It    can    also    be    shown    by 

induction  on  k  that   Corr  (R    ,R^    ,  )    =   Corr  (X^,R      ,)=0intheBELAR(1) 

n     n-k  n      n-k 

process.  Hence,  either  third-order  residual  analysis  will  be  unable  to 
discriminate  the  BELAR( 1 )  process  from  any  of  the  NLAR( 1 )  processes  with 
the  same  autocorrelation  function. 

In  the  spirit  of  looking  at   the   lowest    possible   joint   moments    for 
differentiating  between  models  with  symmetric  marginals,   a  fourth-order 
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analysis  is  proposed.      Two  candidates  are  investigated   as    the    basis   of 

this    analysis.      The  first  one  considered  is   the   cross-correlation  of   X^ 

n 

and  the  linear  residual  at  lag  k,  R   ,  .   The  second  is  the 

n-  k 

autocorrelation  of  R^   and  R^    ,  .      The  two  analyses   are  compared  by  their 

n  n-k  ^  y  J 

abilities    to    differentiate    among   the    different    types    of    NLAR(1) 

processes   and  the   BELAR(I)    process. 

Table  IV.A.1.    contains  a  summary  of    the   models    in   the    comparison, 

along  with   the   selected   sets    of    parameter    values    and    corresponding 

correlation  coefficient.      Even  though  each   of    the   models    has    the    same 

marginal    distribution    (standard  Laplace)   and  identical   autocorrelation 

functions,   each  has   a  theoretical   cross-correlation  function  in  terms  of 

(X',R     ,  )   and  autocorrelation  function  for    (R^.R^   ,  )    that   are  different, 
n     n-k  n     n-k 

The  question  of  how  the  residual  analysis  is  affected  by  parameter 
estimation  is  an   important    issue,   but   is  not   addressed  at   this   time. 

Before  the  candidates  are  developed  in  the  next  two  sections,  it  is 
convenient  now  to  place  both  the  NLAR(1)  and  BELAR(  1  )  processes  into 
their   common  RCA(l)    framework. 

Using  the  terminology  of  Nicholls  and  Quinn  [Ref .  16],  both  the 
NLAR( 1 )   and  the   BELAR(l)   processes   can  be  written  as 


X      =cX      ,-t-BX      ,+e,  (IV.A.1) 

n  n-1  n  n-1  n 


where  {e  }  is  the  i.i.d.  innovation,  E(e  )  =  0,  and  otherwise  defined  as 
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1.  ( 1-a)-Laplace   in  the   BELAR(I)   process; 

2.  standard  Laplace,  but  with  a  degenerate  component   at   the  origin  in 
the   LAR( 1 )    process; 


3.  scaled  Laplace  where   X   =  /  1 -a,    in  the  TLAR(1)   process; 

4.  convex  mixture   of    scaled   Laplace    variables   in  the  general  non- 
boundary  NLAR(I)    process. 


TABLE   IV. A.I 
Summary  of  Models  with  Laplace  Marginals  and  Autocorrelations  of   Y 


Model 


LAR(I) 


NLAR(l) 


BELAR(I) 


TLAR(I) 


Parameter  Values 

T 

Comments 

°'l    = 

1  ;      6^    =      .19215 

.19216 

Linear  models; 

''l    = 

1 ;      6^    =  -.63662 

-.63662 

one  boundary  of 

^'l    = 

1  ;      B^    =     .89986 

.89986 

NLAR(I)    family. 

°'l    = 

6^    =   .43836 

.19216 

General   discrete 

^'l    == 

.797885;      8^    =   -a^ 

-.63662 

random   coefficient 

^'l    = 

B^    =   .94861 

.89986 

model . 

a  = 

.11;      positive  model 

.19216 

General   continuous 

a  = 

.50;     negative  model 

-.63662 

random  coefficient 

a  = 

.884;    positive  model 

.89986 

model . 

°'l    = 

.19216;      B^    =   1 

.19215 

Other   boundary 

°'l    = 

.63662;      B^    =   -1 

-.63562 

model   of  NLAR(1 ) . 

a,    = 

.89986;      B,    =  1 

.89986 

The    {B    }    process    is  the    i.i.d.    random    coefficient    process, 

independent    of    { e    }    and  {X,  }    for   k  ^  n-1    with  E(B   )    =  0  and  otherwise 

n  k                                                 n 

defined  as: 
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1/2  1/2 

1.  ±(A        (a,1-a)    -    Y)  ,    where    Y   =    E{A        (a,1-a)}   and  A   (a,1-a)    is  a 

standard  Beta  random  variable  in  the   BELAR(I)    process; 

2.  0    in    the    LAR(1)    process,    since    it    is    a    linear,    constant 
coefficient  AR(1)   process; 

3.  6.{K'-a}    in   the    other  NLAR(1)   processes,   where  K'    is  a  Bernoulli 

random  variable  such  that   P(K'    =    1)    =  a,    and    0    ^    I  6,  I     ^1    and    a, 

n         1         '  1  '  1 

and  6,  are  not  both  unity.   At  6,  =  ±  1  the  process  is  the  TLAR( 1 ) 
process. 

The  c  is  a  constant  defined  as: 

1  /2 

1.  Y   =   E{A        (a,1-a)}    in  the    BELAR( 1 )    process; 

2.  a    B^    =    6.E(K')    in   all    the    NLAR( 1 )    processes    (ai    =    1    being  the 

LAR(1)    process). 

The  second  and  fourth  moments  of  E  and  the  second,  third  and  fourth 

n 

moments  of  B  are  needed  in  the  following  sections.   In  Table  IV. A. 2, 
n 

there  is  a  convenient  summary  of  the  necessary  equations. 

Now  the  linear  residual,  written  in  terms  from  (IV.A.1)  has  the 
following  forms  analogous  to  (III. E. 4. 3)  and  (III. E. 4.^4), 


R  =  B  X   ,  +  e  ,  (IV. A. 2) 

n    n  n-1    n 

R   =  X  -  cX   ,  .  (IV. A. 3) 

n    n     n-1 


Using  (IV. A. 2)  and  the  independence  of  {B  }  and  [e    },    the  second  and 

n       n 

fourth  moments  of  R  are 

n 
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E(R^)  =   2E(B2)    +   E(e2)  ,  (IV. A. 4) 

n  n  n 

ECR**)  =   24E(B'')    +   12E(B2)E(e2)    +   Eie")  .  (IV. A. 5) 

n  n  n  n  n 


The   variance  of  R^  when  needed  is  derived  from    (IV. A. 4)   and  IV. A. 5). 

n 

B.      RESIDUAL  ANALYSIS    USING  Corr(X'    ,R      ,  ) 

n     n-k 

In   this   section,   the  residual   analysis  using  the  theoretical   cross- 
correlations,   Corr(X%    R  _    )   is  developed.     Using    (IV. A.  1)    and   (IV. A. 2), 

il  11     K 

we  have 


X'    =   c'X'    ,    +   3c^X=^    ,R      +    3cX      ,R^    +   R\  (IV.B.l) 

n  n-1  n-1    n  n-i   n         n 


where  c  is  defined  in  Section  IV. A. 

The   cross-correlation  function  of   X^   and  R      ,    at   lag  k   is  defined  as 

n  n-k 


E(X^R        ) 

C-,  (k)    =   Corr(XSR     ,)=  "  "    "    ,  (IV. B. 2) 

31  n     n-k  ^x'^R 

n     n-k 


where  Var(X^)    =   E(X^)    =  6!    and  Var(R      ,  )    is  given   by    (IV. A. 4)    for    all    n 
n  n  n-k  *  -^ 

and    all    k,    since    as    shown    in   Section   III.E.3,     {R    }    is    stationary 

whenever    {X   }    is. 
n 

Now  from  the  construction  of  R   in  (IV. A. 2),  it  is  explicity  clear 

that  X   and  R   ,  are  dependent  for  all  k  and  that  the  {R  }  are  not 
n      n-k  n 

independent  of  each  other,  unless  B   is  identically  zero  as  in  linear 
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constant   coefficient  AR(1)   processes.     However,  by  the  Residual    Theorem 
(Lawrance  and  Lewis   [Ref.   22]),   the    {R  }   are  uncorrelated. 
From    (IV.B.1),  we  have  for   all   k 


C,,  (k)    =    {c'E(X'    ,R      ,  )    +   Sc^ECX^    ,R   R      ,  )    +   3cE(X      ^R^R      ,  ) 
31  n-1   n-k  n-1   n  n-k  n-1    n  n-k 

+    E(R^R      ,  )}    /    [12/T   {E{R^)}^^^2.  (IV. B. 3) 

n  n-k  n 


Consider    (IV. B. 3)    for   k   <    0.       Since    the   random    coefficients    {B    } 

n 

are  independent  of  the  process  {X.}  for  j  ^  n-1,  this  implies  that 
C  (k)  is  zero  for  k  <  0.  For  k  =  0  in  (IV.B.3),  we  have,  after  some 
simplification, 


72c^E(B2    )    +   6c^E(e^    )    +  72cE(B')    +   E(R'*) 

C_.  (0)    =  2_ " r-75 —.  (IV. B. 4) 

^'  12/5      {E(R^)}'^^ 

n 


For   k  S   1 ,    there  is  the  following  recursive  formula. 


c^(1-c2)E(e2) 

C-,  (k)    =   C_,  (K-1){c^    +   3cE(B2    )    +   E(BM}    +  2y-r    .    (IV.B.5) 

3^  31  n  n  ^^—^^^^,^^1/2 

n 


It  is  now  a  simple  matter  to  consolidate  the  expressions  for  C  (k) 
for  all  k  and  substitute  the  appropriate  expressions  from  Table  IV. A.  2. 
For  the  NLAR( 1 )  models--including  LAR( 1 ) ,  for  which  a,  =  1 ,  and  TLAR(1) 
for  which   6^    =   ±1 — we  have 
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C3^(k)    = 


0, 


10    (1-a^'6^^) 


1  /2 


fa^g^'C^^  (k-1)    + 


/To 


k    <    0; 


,    k    =   0; 


(IV. B. 6) 


k    >    1  . 


For   the   BELAR(I)    process,   we  have 


C3^(k)    = 


0, 

(6    -  5Y^    -   aY") 


3/   10    (1-Y^) 


1/2' 


^  ^^  /   10 


k    <    0; 


k    =   0;       (IV. B. 7) 


k   >    1 


The  theoretical  cross  correlation  functions  for  each  of  the  models 
and  sets  of  parameters  in  Table  IV. A.I  are  given  in  Figures  IV. 8.1  - 
IV. B. 3.  Three  points  can  be  made.  For  the  models  with  |p|  small,  such 
as  in  Figure  IV.B.1,  there  is  little  difference  between  the  cross- 
correlation  functions  of  all  four  models.  (Of  course  for  p  =  0,  there 
is  absolutely  no  difference,  since  all  NLAR(1)  models  and  the  BELAR(I) 
model  collapse  into  the  unique  i.i.d.  case).  A  difference  between  the 
cross-correlation  function  of  the  boundary  NLAR(1)  models — LAR(1)  and 
TLAR( 1 )--does   become  more  apparent   as    I  pi    increases.      But,    there   seems 
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to   be    little   distinction  between  the   cross-correlation  functions  of  X' 

n 

and  R  _      from   the   BELAR(I)   process   and  the  non-boundary  NLAR( 1  )    process 

with   a   =    6-    =   /|  p  I    even   when    |p|     is   large   as   in  Figure  IV. B. 3.      This 
final   point  suggests   the  possibility  that  there  exists  a  pair  of   values, 


(a,  ,8  ),  whose  product  is  p  ?^  0,  for  which  the  BELAR(  1 )  process  and  the 
corresponding  NLAR(1)  process  will  not  only  have  identical 
autocorrelation  functions,  but  may  also  have  nearly  identical  cross- 
correlations    of    X^    and    R       ,     for    some    specified    number     lags 


n-k 


The  cr OSS  - corr el  at i ons  of  X^  and  R   ,  can  also  be  used  to 

n  n-k 

distinguish  the  random  coefficient  AR( 1 )  processes  with  a  standard 
Laplace  marginal  distribution  from  the  Gaussian  AR(1)  process  where 
X  -  N(0,2)  and  e  ~  N { 0 , 2 ( 1  -  a ^ )  } ,  where  a  is  the  correlation 
coefficient.     We  have  for   the  Gaussian  AR( 1 )   models, 


03^  (k)    =   Corr(X3.R^_^)    = 


(3/5(1-a^) 


a'^c^^co; 


1/2 


k    <   -1, 
k    =  0, 
k   >    1. 


(IV. B. 8) 


Note  that  is  C^,(k)  for  k  ^  1  is  proportional  to  Corr(X  ,X  ,  )  =  a  . 
This  is  consistent  with  the  fact  that  a  Gaussian  process  is  completely 
determined  by  the  mean   and  covariance  matrix. 

Figures  IV. B. 4  -  IV. B. 6  show  the  theoretical  cross-correlation 
function  of  the  Gaussian  AR( 1 )  model  superimposed  on  the  values  for  the 
different    models    from  Figures   IV. B.I    -   IV. B. 3,    which  have   the  standard 
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Laplace  marginal   distribution.      There  is    some    differentiation    between 

the    Laplace   models    with   AR( 1 )    correlation   structure    and    the    given 

Gaussian  AR( 1 )   model,   but   not  much.      It  would,   however,   be   very   easy   to 

identify   the    Gaussian   model    from   the  Laplace  models  using  probability 

plots.       This    illustrates    the    point   made    at    the    beginning    of    this 

chapter,    that    a   higher-order   residual    analysis    is   not    intended   to 

replace  the  existing  methods  of  analysis.      It  also  emphasizes   one  of   the 

very  foundations  of   the  thesis,   that  the  marginal   distribution  is  one  of 

the   very  first  aspects   of   a  time  series   that  should  be  examined. 

C.      RESIDUAL  ANALYSIS    USING  CorrCR^R^    ,  ) 

n  n-k 

In  this  section,  the  residual  analysis  using  the  theoretical 

2      d2 

n     n-k 


autocorrelations,   Corr(R^,R^   ,)    is  developed. 


Let  C      (k)   represent  Corr(R^,R^_    )    for   all    k.      Since  the   correlation 
function   is    an    even    function  and   {R  }    is  stationary,   C„„(k)    =  Cpp(-k). 

We  represent   only  k   =  0,1,2 Using    (IV. A. 2),    we    have    after    some 

simplification  for   k  ^    1, 


n     n-k 


=   [£{(8^X2    .    +   2B^X      .£      +    e^)R^    ,  }    -  E(R^)E(R2    J]    /    ia^^o^^      ) 
n  n-1  n  n-1    n  n     n-k  n  n-k  R     R      , 

n     n-k 


n     n-k 


(E(B^)    Cov(X=.,,R^.^))    /    (0^,0^,      ) 

n     n-k 


235 


=   E(B^)    Cov(X^,R^_^j^_^^)    /   Var(R^).  (IV. C.I) 


Now  an  immediate   advantage    to   the    analysis    based  on    (IV. C.I)    as 

opposed   to   that    based  on  Corr(X',    R  _    )   is  apparent.      For  the  constant 

coefficient  models,    LAR(1),    Cp_(k)    will    have    a  spike    at    lag-0   and   be 

zero   for    all    other    lags,    since    B     =0.      This  will   not   be  the   case  for 

n 

the   other    NLAR( 1 )    random    coefficient    processes    or    in   the    BELAR(  1  ) 

process.      It  will   not,  however,   distinguish  the  LAR(1)   process  from  any 

linear  AR(1)   process.      This,  however,   can  be  achieved  using  probability 

plots   as  mentioned  previously. 

To  derive   a  computational    formula   from    (IV. C.I.)    in   terms    of    the 

parameters    of    the    process,    first    let    E^^(k)    =    E(X^R^    ,  ).      Then, 

22  n    n-k 

substituting    in    (IV. A.I)    and    ( IV . A .  2 )  ,     we    have,     after     some 
simplification  for   k   =  0, 


E_-(0)    =   E{(cX      .    +   B   X      .    +    e    )HB  X        +    e    )^} 
22  n-1  n  n-1  n  n  n-1        n 

=   E(e'')    +   2cE(e2)    +   12E(e2)E(B2)    +24c2E(B2) 
n  n  n  n  n 

+   48cE(B')    +   24E(B**).  (IV. C. 2) 

n  n 


For   k  ^    1  ,    we  have   the  recursion 


E22(l<)    =    (c^    +   E(B^)}E22(k-1)    +   E{e^)E{R^_^)  .  (IV. C. 3) 


Again  using  the  stationarity   of    {X    }    and    {R^},    we   have    the    following 
expression  for   the   autocorrelation  function 
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0^2  (k)    = 


1, 


E(B2) 
n 


2E(R^)}, 


k    =    0; 


k   >    1. 


(IV. C. 4) 


For    the   non-LAR(l)   cases  of   the  NLAR(1)   process,   we  substitute  from 
Table  IV. A. 2  and  equations    (IV. A. 4)    and   (IV. A. 5)    to  obtain 


E^^CM    = 


M6    -  a^^6^^(5+126^-11a^Bp},  k    =  0; 

.a^6jE22(K-1)    +   4(  1 -a^  B^  ( 1-a^'Bp  ,  k    >    1. 


(IV. C. 5) 


C22(l<)    = 


1,  k    =  0; 

.a^(1-a^)B^{E22(k-1)    -   Ml-a^^Bpl  r   >    i. 


4{5-a^B^(4    +   24B^    -   42a^B^    +   19a^S^' 


(IV. C. 6) 


The   corresponding  results   for   the   BELAR(I)    process   are 


E22(k)    - 


12(2+aT^-3Y2) , 


aE22(k-1)    +   Ml-a)(1-Y2), 


k    =   0; 
k   >    1  . 


(IV. C. 7) 


C22(k)    = 


1, 


(a-T^){E22(k-l)    -   M1-T")} 


4(5-12Y2+26aY^-19T'*) 


k    =   0; 
k    >    1  . 


(IV. C. 8) 


The  theoretical   autocorrelation  functions  for   each  of   the  models  and 

sets   of   parameters   in  Table  IV. A.I    are  given   in  Figures   IV. C. 1-3.      There 

appears   to  be  more   discrimination   between    the    TLAR(l)    model    and    the 

other    random    coefficient    models    with    Corr(R^,R^    ,  )    than    with 

n       n-  k 

Corr(X^,    R  _,  ),    even  when    |pl     is    small,    as    seen    in    comparing    Figures 
r  1         ri""  K. 
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IV. C.I    and    IV. B.I.       There    is    still    little  discrimination  between  the 

BELAR(I)   model   and  the  given  non-boundary  NLAR( 1 )    model.      However,    the 

important    point   is  that  since  the  LAR(1)   model    is  a  linear  AR(1)   model, 

Corr(R^,R^   ,)      =0  for   all   values   of   p  and  for   all   k   =   ±1,    ±2, 

n     n-k  ^ 
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V.      EXTENSIONS   AND  OPEN  PROBLEMS 

A.  INTRODUCTION 

During  the  discussion  in  the  previous  chapters,  possible  extensions 
and/or  unresolved  issues  have  been  mentioned.  At  this  point,  we  con- 
clude by  summarizing  some  of  the  directions  in  which  this  research  could 
be  continued.  There  are  still  significant  contributions  to  be  made, 
particularly  in  parameter  estimation,  model  development  and 
applications . 

B.  ESTIMATION 

There  are  several  open  questions  and  extensions  in  the  area  of 
parameter  estimation  and  inference  for  this  class  of  stochastic 
processes. 

First,  there  is  the  need  to  obtain  theoretical  results  substantiat- 
ing the  empirical  results  from  the  simulation  of  the  maximum  likelihood 
estimator  (m.l.e.)  of  serial  correlation  in  the  TLAR(l)  and  the  BELAR( 1 ) 
processes.  Several  researchers  have  written  on  the  subject  of  maximum 
likelihood  estimation  in  dependent  sequences.  Much  of  this  is  assembled 
in  the  books  by  Basawa  and  Prakasa  Rao  [Ref .  ^2]  and  by  Basawa  and  Scott 
[Ref .  43].  It  is  not  known  whether  the  conditions  on  the  conditional 
densities  are  satisfied  in  the  cases  of  these  random  coefficient  AR(1) 
models  to  prove  that  the  m.l.e.  is  consistent,  asymptotically  efficient 
or    asymptotically   Normal.     Conditions  for  the  existence  of   the  m.l.e's 
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are  generally  extremely  complicated  and  difficult  to  verify  unless  the 
log  likelihood  is  absolutely  continuous   in  the  parameter  space. 

A  second  problem  to  resolve  is  that  of  existence  and  uniqueness  of 
the  maximum  likelihood  estimators  of  (a  ,3.)  in  the  NLAR(1)  process.  In 
this  case,  the  log  likelihood  is  definitely  not  dif f erenti able  with 
respect  to  the  parameter,  6.,  nor  is  it  clear  that  there  is  a  unique 
maximum.  It  appears  from  contour  plots  of  the  log-likelihood  function 
over  a  grid  of  values  in  (a.,B.)  coordinates  that  there  is  a  unique 
local  optimum  within  the  region  bounded  by  0  <  a,  <  1  and  -1  <  6^  <  1 
for  large  enough  samples  of  {X  }.  A  non-linear  optimization  technique 
that  uses  only  function  values  and  not  derivatives  seems  to  be 
appropriate,  since  the  log-likelihood  function  is  not  dif ferentiable 
everywhere  with  respect   to   B   . 

A  third  problem  involves   the   2, -Beta-Laplace  ARC  1 )   model.      Except  for 

the    case   when    i    is    assumed    to    be    one    (the    BELAR(I)    model),    the 

likelihood  function  in    (a, 8,)   has   not   been   derived.      This  is  primarily  a 

numerical    issue  since  neither  the   density  of   X     for    non-integer    values 

of    i   nor    the    conditional    density   of    X      given   X      ,    for   any  values   of 

n    ^  n-1  ■' 

S,  >   0  have   closed-form  expressions. 

A  fourth  issue  in  estimation  is  to  extend  the  maximum  likelihood 
approach  to  include  the  joint  estimation  of  the  scale  parameter  of  the 
marginal  distribution  to  that  of  the  shape  parameter  and  the  serial 
correlation  coefficient.  There  is  no  reason  to  assume  that  the  marginal 
distribution  should  always   be   a  standard  Laplace  or  standard   il-Laplace. 
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Finally,  there  is  the  issue  of  quantile  estimation  in  the  random 
coefficient  models.  Empirical  results  are  given  only  for  the  BELAR(I) 
process  for  the  distribution  of  the  sample  median.  Theoretical  results 
are  related  to  mixing  conditions.  Based  on  a  new  mixing  condition, 
which  has  been  shown  to  be  satisfied  by  linear  AR(1)  processes 
[Ref  .  44],  Gastwirth  and  Rubin  derived  the  asymptotic  Normal  theory  of 
quantile  estimation  for  the  linear  LAR(1)  process.  The  open  question  is 
whether  the  mixing  condition  of  Gastwirth  and  Rubin  is  satisfied  by  any 
of  the  random  coefficient  models — NLAR(  1  )  ,  BELAR(  1  )  or  Jl-Beta-Laplace 
AR(1)  . 
C.      MODEL   DEVELOPMENT 

Advances  in  modelling  can  be  made  in  developing  scalar  models  with 
p-th  order  autocorrelation  structure,  as  well  as  bivariate  autoregres- 
sive  models. 

An  open  question  in  the  development  of  the  NLARMA(p,q)  family  of 
models  is  the  existence  of  the  general  class  of  models  with  p-th  order 
autocorrelation  structure — NLAR(p)  for  p  ^  3;  specifically,  it  is  to 
derive  the  distribution  of  the  i.i.d.  innovation  sequence  {e  }.  This  is 
only  known  for   the  TLAR(p)   subclass   of   a  proposed  NLAR(p)   family. 

A  similar  question  is  open  for  p  ^  2  in  the  continuous  random 
coefficient  models  with  an  H-Laplace  marginal  distribution.  The  actual 
structure  of  the  model,  as  well  as  the  distribution  of  the  innovation  is 
in  question. 

There  is  also  a  need  for  multivariate  time  series  in  many  fields  of 
physical   science.      The  NEAR(2)    framework   was    used    by    Dewald   and   Lewis 
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[  Ref .    2M]    to    derive    a    bivariate    exponential    AR( 1 )    model.      Such    an 

extension   is    also    possible  with   the    NLAR(2)    model.       Just    how    one 

estimates    the   eight    possible   parameters    in   such   a   model    is    an  open 

question. 

Related   to   the   model    development    and   parameter   estimation  is  the 

need  to  identify  particular  models.      Higher   order  residual   analyses   have 

been    based  on  the  linear  residual   R     =   X     -  a.X      ,    -  a^X     _.     Since  the 

n  n  1    n-1  2  n-2 

NLAR(2)   model    is  only  partially  time  reversible,    it  is  possible  that   the 

reversed   residual    R      =    X      -    a,X      ,    -    a^X      ,    could    be    used    in  model 

n  n  1    n+1  2   n+1 

identification  as   well.     These   were    introduced    by    Lawrance    and   Lewis 

[Ref.   6,    45]  but   their  use  has   not   been  explored  in  any  context. 

There  is  also  the   question  of   the  effect   that    estimating   a      and   a^ 

from    {X    }    will   have  on  the  sample  autocorrelation  of    (R^.R^      ,  )   and  the 
n  ^  n       n-k 

cross-correlation  of    (X^.R      ,  )    in   the    fourth-order    residual    analyses 

n     n-k 

proposed  in  Chapter   IV. 
D.       APPLICATIONS 

In  Chapter  I,  several  areas  have  been  noted  where  the  modelling  is 
accomplished  with  heavy-tailed  distributions,  notably  in  voice  and 
acoustics  modelling,  as  well  as  in  image  coding.  In  these  areas,  the 
Laplace  distribution  and  the  symmetric  Gamma  distributions  are  widely 
used.  There  is  the  possibility  that  the  2,-Laplace  for  i  <  1  could  also 
be  a  useful  alternative  to  the  symmetric  Gamma.  One  advantage  of  the 
Z-Laplace  distribution,  which  is  the  difference  of  two  i.i.d.  Gamma(!,,X) 
is  the  simplicity  of   the  form  of   the   characteristic  function. 
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Another  field  in  which  the  2, -Laplace  models  could  be  useful  is  in 
the  modelling  of  the  directional  components  of  wind  speed.  Models  with 
skewed  marginal  distributions  have  been  fitted  to  data  and  then 
transformed  either  to  Normals  (for  example  by  Brown,  Katz  and  Murphy 
[Ref.  46]),  or  to  Exponentials  by  Lawrance  and  Lewis  [Ref.  6].  In  both 
of  the  cited  papers,  the  data  indicated  that  the  wind  was  almost  always 
blowing.  The  question  is,  however,  how  does  one  model  wind  velocity 
when  there  are  long  calm  periods.  This  is  a  problem  from  Australia  as 
related  by  T.  Lewis  in  the  discussion  of  the  NEAR(2)  model  [Ref.  6].  As 
can  be  seen  in  Figure  III. C.I,  for  small  values  of  I,  highly  correlated 
periods  of  calm  and  wind  can  be  generated  using  the  2, -Beta-Laplace  AR(1) 
model . 

The  preceding  examples  demonstrate  the  opportunities  for  continued 
research  and  are  not   intended  to  narrow  the  focus   of   future  endeavors. 
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VI.      SUMMARY   AND   CONCLUSIONS 

We  have  indicated  by  reference  to  the  scientific  literature  that  there 
are  important  application  areas,  especially  in  the  physical  sciences  of  time 
series  whose  marginal  distributions  are  non-Normal.  This  feature,  itself, 
presents   new  problems  in  the  modelling,   study  and  analysis. 

For  those  areas  where  the  non-Normality  manifests  itself  primarily  in 
the  thickness  (heaviness)  of  the  tails  of  the  marginal  distributions,  we 
have  demonstrated  that  within  the  S,-Laplace  family  of  distributions,  there 
is  an  appropriate  member  with  which  to  model  phenomenon  with  a  symmetric 
heavy-tailed  marginal  distribution.  The  H-Laplace  family  has  very  thick 
tails  when   i   is  small   and  a  limiting  Normal   distribution  as   I   increases. 

To  account  for  serial  dependence  in  the  time  series  we  have  derived  two 
families  of  random  processes  that  extends  the  random  coefficient  approach  to 
modelling  non-Normal  time  series.  The  discrete  random  coefficient  models 
(NLARMA(p  ,q ) )  have  a  Laplace  marginal  distribution  and  the  continuous  random 
coefficient  models  ( !l-Beta-Laplace  AR(  1  )  and  MA(q))  have  an  2,-Laplace 
marginal  distribution.  Both  families  are  additive  models  and  imitate  the 
linear  Gaussian  models  in  that  they  exhibit  the  usual  ARMA(p,q)  correlation 
structure.  The  models  are  parametrically  parsimonious,  structurally  simple 
and  easy  to  generate  on  a  computer. 

We  have  also  demonstrated  that  the  fourth-order  residual  analyses  based 
on  the  uncorrelated,  but  dependent  sequence  {R  }  are  appropriate  and  useful 
methods   to  discriminate    between    the    discrete   random    coefficient    and    the 
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continuous  random  coefficient  models  when  first,  second  and  third-order 
properties   are  identical. 

For  the  purposes  of  parameter  estimation,  we  derived  the  joint 
probability  density  function.  Numerical  routines  were  written  to  maximize 
the  likelihood  function  to  estimate  the  serial  correlation  coefficient  in 
the  BELAR(I)  and  the  TLAR( 1 )  processes.  Simulation  results  indicated  that 
this  estimator  was  more  efficient  and  less  biased  than  the  least  squares 
estimator  derived  from  the  linear  residual. 

Finally,  we  summarized  some  of  the  remaining  issues  in  this  field  of 
non-Normal  time  series  analysis.  Extensions  of  the  analyses  in  this  thesis 
which  need  to  be  pursued  are  noted,  along  with  possible  applications  in 
those  previously  mentioned  fields   of   the  physical   sciences. 
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